Thermus thermophilus nucleic acid polymerases

ABSTRACT

The invention provides novel nucleic acid polymerases from strains GK24 and RQ-1 of Thermus thermophilus, and nucleic acids encoding those polymerases, as well as methods for using the polymerases and nucleic acids.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a Divisional application of U.S. patent application Ser. No. 14/656,585, filed Mar. 12, 2015; which is a Continuation application of U.S. patent application Ser. No. 13/770,252, filed Feb. 19, 2013 (now U.S. Pat. No. 8,999,689); which is a Continuation application of U.S. patent application Ser. No. 12/905,008, filed Oct. 14, 2010 (now U.S. Pat. No. 8,399,231); which is a Continuation application of U.S. patent application Ser. No. 12/193,691, filed Aug. 18, 2008 (now abandoned); which is a Divisional application of U.S. patent application Ser. No. 11/609,174, filed Dec. 11, 2006 (now U.S. Pat. No. 7,422,872); which is a Divisional application of U.S. patent application Ser. No. 10/303,110, filed Nov. 22, 2002 (now U.S. Pat. No. 7,148,340); all of which claim a priority benefit under 35 U.S.C. § 119(e) from U.S. Provisional Patent Application No. 60/336,046, filed Nov. 30, 2001, the disclosures of all of which are herein incorporated by reference in their entireties.

FIELD OF THE INVENTION

The invention relates to nucleic acids and polypeptides for nucleic acid polymerases from a thermophilic organism, Thermus thermophilus.

BACKGROUND OF THE INVENTION

DNA polymerases are naturally occurring intracellular enzymes used by a cell for replicating DNA by reading one nucleic acid strand and manufacturing its complement. Enzymes having DNA polymerase activity catalyze the formation of a bond between the 3′ hydroxyl group at the growing end of a nucleic acid primer and the 5′ phosphate group of a newly added nucleotide triphosphate. Nucleotide triphosphates used for DNA synthesis are usually deoxyadenosine triphosphate (A), deoxythymidine triphosphate (T), deoxycytidine triphosphate (C) and deoxyguanosine triphosphate (G), but modified or altered versions of these nucleotides can also be used. The order in which the nucleotides are added is dictated by hydrogen-bond formation between A and T nucleotide bases and between G and C nucleotide bases.

Bacterial cells contain three types of DNA polymerases, termed polymerase I, II and III. DNA polymerase I is the most abundant polymerase and is generally responsible for certain types of DNA repair, including a repair-like reaction that permits the joining of Okazaki fragments during DNA replication. Polymerase I is essential for the repair of DNA damage induced by UV irradiation and radiomimetic drugs. DNA Polymerase II is thought to play a role in repairing DNA damage that induces the SOS response. In mutants that lack both polymerase I and III, polymerase II repairs UV-induced lesions. Polymerase I and II are monomeric polymerases while polymerase III is a multisubunit complex.

Enzymes having DNA polymerase activity are often used in vitro for a variety of biochemical applications including cDNA synthesis and DNA sequencing reactions. See Sambrook et al., Molecular Cloning: A Laboratory Manual (3rd ed., Cold Spring Harbor Laboratory Press, 2001), hereby incorporated by reference. DNA polymerases are also used for amplification of nucleic acids by methods such as the polymerase chain reaction (PCR) (Mullis et al., U.S. Pat. Nos. 4,683,195, 4,683,202, and 4,800,159, incorporated by reference) and RNA transcription-mediated amplification methods (e.g., Kacian et al., PCT Publication No. WO91/01384, incorporated by reference).

DNA amplification utilizes cycles of primer extension through the use of a DNA polymerase activity, followed by thermal denaturation of the resulting double-stranded nucleic acid in order to provide a new template for another round of primer annealing and extension. Because the high temperatures necessary for strand denaturation result in the irreversible inactivation of many DNA polymerases, the discovery and use of DNA polymerases able to remain active at temperatures above about 37° C. provides an advantage in cost and labor efficiency.

Thermostable DNA polymerases have been discovered in a number of thermophilic organisms including Thermus aquaticus, one strain of Thermus thermophilus, and certain species within the genera Bacillus, Thermococcus, Sulfolobus, and Pyrococcus. A full length thermostable DNA polymerase derived from Thermus aquaticus (Taq) has been described by Lawyer et al., J. Biol. Chem. 264:6427-6437 (1989) and Gelfand et al., U.S. Pat. No. 5,079,352. The cloning and expression of truncated versions of that DNA polymerase are further described in Lawyer et al., in PCR Methods and Applications, 2:275-287 (1993), and Barnes, PCT Publication No. WO92/06188 (1992). Sullivan reports the cloning of a mutated version of the Taq DNA polymerase in EPO Publication No. 0482714A1 (1992). A DNA polymerase from Thermus thermophilus has also been cloned and expressed. Asakura et al., J. Ferment. Bioeng. (Japan), 74:265-269 (1993). However, the properties of the various DNA polymerases vary. Accordingly, new DNA polymerases are needed that have improved sequence discrimination, better salt tolerance, varying degrees of thermostability, improved tolerance for labeled or dideoxy nucleotides and other valuable properties.

SUMMARY OF THE INVENTION

The invention provides nucleic acid polymerase enzymes isolated from a thermophilic organism, Thermus thermophilus. The invention provides nucleic acid polymerases from several Thermus thermophilus strains, including strain RQ-1, strain GK24 and strain 1b21. Therefore, in one embodiment the invention provides an isolated nucleic acid encoding a Thermus thermophilus strain RQ-1 (DSM catalog number 9247) nucleic acid polymerase.

In another embodiment, the invention provides an isolated nucleic acid encoding a nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-24.

In another embodiment, the invention provides an isolated nucleic acid encoding a derivative nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15 having a mutation that decreases 5′-3′ exonuclease activity. Such a derivative nucleic acid polymerase can have decreased 5′-3′ exonuclease activity relative to a nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15.

In another embodiment, the invention provides an isolated nucleic acid encoding a derivative nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15 having a mutation that reduces discrimination against dideoxynucleotide triphosphates. Such a derivative nucleic acid polymerase can have reduced discrimination against dideoxynucleotide triphosphates relative to a nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15.

The invention also provides an isolated nucleic acid encoding a nucleic acid polymerase comprising any one of SEQ ID NO:1-12, and isolated nucleic acids complementary to any one of SEQ ID NO:1-12.

The invention also provides vectors comprising these isolated nucleic acids, including expression vectors comprising a promoter operably linked to these isolated nucleic acids. Host cells comprising such isolated nucleic acids and vectors are also provided by the invention, particularly host cells capable of expressing a thermostable polypeptide encoded by the nucleic acid, where the polypeptide has nucleic acid polymerase activity and/or DNA polymerase activity.

The invention also provides isolated polypeptides that can include any one of amino acid sequences SEQ ID NO:13-24. The isolated polypeptides provided by the invention can have any one of amino acid sequences SEQ ID NO:13-24, which can, for example, have a DNA polymerase activity between 50,000 U/mg protein and 500,000 U/mg protein.

In another embodiment, the invention provides an isolated derivative nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15 having a mutation that decreases 5′-3′ exonuclease activity. Such a derivative nucleic acid polymerase can have decreased 5′-3′ exonuclease activity relative to a nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15.

In another embodiment, the invention provides an isolated derivative nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15 having a mutation that reduces discrimination against dideoxynucleotide triphosphates. Such a derivative nucleic acid polymerase can have reduced discrimination against dideoxynucleotide triphosphates relative to a nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15.

The invention also provides a kit that includes a container containing at least one of the nucleic acid polymerases of the invention. Such a nucleic acid polymerase can have an amino acid sequence comprising any one of amino acid sequences SEQ ID NO:13-24. The kit can also contain an unlabeled nucleotide, a labeled nucleotide, a balanced mixture of nucleotides, a chain terminating nucleotide, a nucleotide analog, a buffer solution, a solution containing magnesium, a cloning vector, a restriction endonuclease, a sequencing primer, a solution containing reverse transcriptase, or a DNA or RNA amplification primer. Such kits can, for example, be adapted for performing DNA sequencing, DNA amplification, and RNA amplification or primer extension reactions.

The invention further provides a method of synthesizing a nucleic acid that includes contacting a polypeptide comprising any one of amino acid sequences SEQ ID NO:13-24 with a nucleic acid under conditions sufficient to permit polymerization of the nucleic acid. Such a nucleic acid can be a DNA or an RNA.

The invention further provides a method for thermocyclic amplification of nucleic acid that comprises contacting a nucleic acid with a thermostable polypeptide having any one of amino acid sequences SEQ ID NO:13-24 under conditions suitable for amplification of the nucleic acid, and amplifying the nucleic acid. Such amplification can be, for example, by Strand Displacement Amplification or Polymerase Chain Reaction.

The invention also provides a method of primer extending DNA comprising contacting a polypeptide comprising any one of amino acid sequences SEQ ID NO:13-24 with a DNA under conditions sufficient to permit polymerization of DNA. Such primer extension can be performed, for example, to sequence DNA or to amplify DNA.

The invention further provides a method of making a nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-24, the method comprising incubating a host cell comprising a nucleic acid that encodes a polypeptide comprising any one of amino acid sequences SEQ ID NO:13-24, operably linked to a promoter under conditions sufficient for RNA transcription and translation. In one embodiment, the method uses a nucleic acid that comprises any one of SEQ ID NO:1-12. The invention is also directed to a nucleic acid polymerase made by this method.

DESCRIPTION OF THE FIGURES

FIGS. 1A and 1B provide a comparison of amino acid sequences of polymerases from four strains of Thermus thermophilus: HB8, Z05, GK24 (Kwon et al. 1997; GenBank accession number U62584) and the Thermus thermophilus strain GK24 polymerase of this invention (SEQ ID NO: 9). The four nonconservative differences between the Kwon GK24 amino acid sequence and SEQ ID NO:9 are shown in blue. Single amino acid changes among the four strains are shown in red.

FIGS. 2A, 2B, and 2C provide a comparison of amino acid sequences from four different strains of Thermus thermophilus: HB8, Z05, GK24 (GenBank accession number U62584) and RQ-1 (SEQ ID NO:10). The amino acid sequence of the wild-type polymerase from Thermus thermophilus strain RQ-1 has eight (8) changes from the sequence of Thermus thermophilus strain HB8 and twenty-five (25) changes from the sequence of Thermus thermophilus strain Z05 (U.S. Pat. No. 5,674,738).

FIG. 3 provides a comparison of amino acid sequences of polymerases from two strains of Thermus thermophilus: HB8 and 1b21.

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to nucleic acid and amino acid sequences encoding nucleic acid polymerases from thermophilic organisms. In particular, the present invention provides nucleic acid polymerases from certain strains of Thermus thermophilus, including strains GK24, RQ-1 and 1b21. The nucleic acid polymerases of the invention can be used in a variety of procedures, including DNA synthesis, reverse transcription, DNA primer extension, DNA sequencing and DNA amplification procedures.

DEFINITIONS

The term “amino acid sequence” refers to the positional arrangement and identity of amino acids in a peptide, polypeptide or protein molecule. Use of the term “amino acid sequence” is not meant to limit the amino acid sequence to the complete, native amino acid sequence of a peptide, polypeptide or protein.

“Chimeric” is used to indicate that a nucleic acid, such as a vector or a gene, is comprised of more than one nucleic acid segment and that at least two nucleic acid segments are of distinct origin. Such nucleic acid segments are fused together by recombinant techniques, resulting in a nucleic acid sequence that does not occur naturally.

The term “coding region” refers to the nucleotide sequence that codes for a protein of interest. The coding region of a protein is bounded on the 5′ side by the nucleotide triplet “ATG” that encodes the initiator methionine and on the 3′ side by one of the three triplets that specify stop codons (i.e., TAA, TAG, and TGA).
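By way of illustration only, the boundaries described above can be located computationally. The following Python sketch is not part of the disclosed methods; the function name find_coding_region and the example sequence are hypothetical, and the sketch assumes a single DNA strand read 5′ to 3′ with no introns. It scans for the first “ATG” and then reads in-frame triplets until one of the three stop codons is reached.

```python
# Illustrative sketch only; names and the example sequence are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_coding_region(seq):
    """Return (start, end) for the first ATG-to-stop region, or None.

    `start` is the index of the A in ATG; `end` is exclusive and
    includes the stop codon, so seq[start:end] is the coding region.
    """
    seq = seq.upper()
    start = seq.find("ATG")
    while start != -1:
        # Read triplets in the frame set by this ATG.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return start, i + 3
        # No in-frame stop codon; try the next ATG.
        start = seq.find("ATG", start + 1)
    return None

region = find_coding_region("CCATGAAATAGGG")
```

In this example the region spans the triplets ATG, AAA and TAG. A sequence with an ATG but no in-frame stop codon yields None.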

“Constitutive expression” refers to expression using a constitutive promoter.

“Constitutive promoter” refers to a promoter that is able to express the gene that it controls in all, or nearly all, phases of the life cycle of the cell.

“Complementary” or “complementarity” are used to define the degree of base-pairing or hybridization between nucleic acids. For example, as is known to one of skill in the art, adenine (A) can form hydrogen bonds or base pair with thymine (T) and guanine (G) can form hydrogen bonds or base pair with cytosine (C). Hence, A is complementary to T and G is complementary to C. Complementarity may be complete when all bases in a double-stranded nucleic acid are base paired. Alternatively, complementarity may be “partial,” in which only some of the bases in a nucleic acid are matched according to the base pairing rules. The degree of complementarity between nucleic acid strands has an effect on the efficiency and strength of hybridization between nucleic acid strands.
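The distinction between complete and partial complementarity can be made concrete with a short Python sketch. It is offered only as an illustration under stated assumptions: the function names reverse_complement and fraction_paired are hypothetical, the two strands are taken to be DNA of equal length, and the second strand is aligned antiparallel to the first without gaps.

```python
# Illustrative sketch only; function names are hypothetical.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the strand that is completely complementary to `seq`."""
    return "".join(PAIR[base] for base in reversed(seq.upper()))

def fraction_paired(strand, other):
    """Fraction of positions that base pair (A-T or G-C) when `other`
    is aligned antiparallel to `strand`. Both are written 5' to 3'."""
    if not strand or len(strand) != len(other):
        raise ValueError("strands must be non-empty and of equal length")
    partner = other.upper()[::-1]  # reverse so positions face each other
    paired = sum(PAIR.get(a) == b for a, b in zip(strand.upper(), partner))
    return paired / len(strand)

complete = fraction_paired("ATGC", reverse_complement("ATGC"))
partial = fraction_paired("ATGC", "GCAA")
```

A value of 1.0 corresponds to complete complementarity; any lower value corresponds to partial complementarity.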

The “derivative” of a reference nucleic acid, protein, polypeptide or peptide, is a nucleic acid, protein, polypeptide or peptide, respectively, with a related but different sequence or chemical structure than the respective reference nucleic acid, protein, polypeptide or peptide. A derivative nucleic acid, protein, polypeptide or peptide is generally made purposefully to enhance or incorporate some chemical, physical or functional property that is absent or only weakly present in the reference nucleic acid, protein, polypeptide or peptide. A derivative nucleic acid generally can differ in nucleotide sequence from a reference nucleic acid whereas a derivative protein, polypeptide or peptide can differ in amino acid sequence from the reference protein, polypeptide or peptide, respectively. Such sequence differences can be one or more substitutions, insertions, additions, deletions, fusions and truncations, which can be present in any combination. Differences can be minor (e.g., a difference of one nucleotide or amino acid) or more substantial. However, the sequence of the derivative is not so different from the reference that one of skill in the art would not recognize that the derivative and reference are related in structure and/or function. Generally, differences are limited so that the reference and the derivative are closely similar overall and, in many regions, identical. A “variant” differs from a “derivative” nucleic acid, protein, polypeptide or peptide in that the variant can have silent structural differences that do not significantly change the chemical, physical or functional properties of the reference nucleic acid, protein, polypeptide or peptide. In contrast, the differences between the reference and derivative nucleic acid, protein, polypeptide or peptide are intentional changes made to improve one or more chemical, physical or functional properties of the reference nucleic acid, protein, polypeptide or peptide.

The terms “DNA polymerase activity,” “synthetic activity” and “polymerase activity” are used interchangeably and refer to the ability of an enzyme to synthesize new DNA strands by the incorporation of deoxynucleoside triphosphates. A protein that can direct the synthesis of new DNA strands by the incorporation of deoxynucleoside triphosphates in a template-dependent manner is said to be “capable of DNA synthetic activity.”

The term “5′ exonuclease activity” refers to the presence of an activity in a protein that is capable of removing nucleotides from the 5′ end of a nucleic acid.

The term “3′ exonuclease activity” refers to the presence of an activity in a protein that is capable of removing nucleotides from the 3′ end of a nucleic acid.

“Expression” refers to the transcription and/or translation of an endogenous or exogenous gene in an organism. Expression generally refers to the transcription and stable accumulation of mRNA. Expression may also refer to the production of protein.

“Expression cassette” means a nucleic acid sequence capable of directing expression of a particular nucleotide sequence. Expression cassettes generally comprise a promoter operably linked to the nucleotide sequence to be expressed (e.g., a coding region) that is operably linked to termination signals. Expression cassettes also typically comprise sequences required for proper translation of the nucleotide sequence. The expression cassette comprising the nucleotide sequence of interest may be chimeric, meaning that at least one of its components is heterologous with respect to at least one of its other components. The expression of the nucleotide sequence in the expression cassette may be under the control of a constitutive promoter or of an inducible promoter that initiates transcription only when the host cell is exposed to some particular external stimulus. In the case of a multicellular organism, the promoter can also be specific to a particular tissue or organ or stage of development.

The term “gene” is used broadly to refer to any segment of nucleic acid associated with a biological function. The term “gene” encompasses the coding region of a protein, polypeptide, peptide or structural RNA. The term “gene” also includes sequences up to a distance of about 2 kb on either end of a coding region. These sequences are referred to as “flanking” sequences or regions (these flanking sequences are located 5′ or 3′ to the non-translated sequences present on the mRNA transcript). The 5′ flanking region may contain regulatory sequences such as promoters and enhancers or other recognition or binding sequences for proteins that control or influence the transcription of the gene. The 3′ flanking region may contain sequences that direct the termination of transcription, post-transcriptional cleavage and polyadenylation as well as recognition sequences for other proteins. A protein or polypeptide encoded in a gene can be full length or any portion thereof, so that all activities or functional properties are retained, or so that only selected activities (e.g., enzymatic activity, ligand binding, or signal transduction) of the full-length protein or polypeptide are retained. The protein or polypeptide can include any sequences necessary for the production of a proprotein or precursor polypeptide. The term “native gene” refers to a gene that is naturally present in the genome of an untransformed cell.

“Genome” refers to the complete genetic material that is naturally present in an organism and is transmitted from one generation to the next.

The terms “heterologous nucleic acid,” or “exogenous nucleic acid” refer to a nucleic acid that originates from a source foreign to the particular host cell or, if from the same source, is modified from its original form. Thus, a heterologous gene in a host cell includes a gene that is endogenous to the particular host cell but has been modified through, for example, the use of DNA shuffling. The terms also include non-naturally occurring multiple copies of a naturally occurring nucleic acid. Thus, the terms refer to a nucleic acid segment that is foreign or heterologous to the cell, or normally found within the cell but in a position within the cell or genome where it is not ordinarily found.

The term “homology” refers to a degree of similarity between a nucleic acid and a reference nucleic acid or between a polypeptide and a reference polypeptide. Homology may be partial or complete. Complete homology indicates that the nucleic acid or amino acid sequences are identical. A partially homologous nucleic acid or amino acid sequence is one that is not identical to the reference nucleic acid or amino acid sequence. Hence, a partially homologous nucleic acid has one or more nucleotide differences in its sequence relative to the nucleic acid to which it is being compared. The degree of homology can be determined by sequence comparison. Alternatively, as is understood by those skilled in the art, DNA-DNA or DNA-RNA hybridization, under various hybridization conditions, can provide an estimate of the degree of homology between nucleic acids (see, e.g., Hames and Higgins (eds.), Nucleic Acid Hybridization, IRL Press, Oxford, U.K.).

“Hybridization” refers to the process of annealing complementary nucleic acid strands by forming hydrogen bonds between nucleotide bases on the complementary nucleic acid strands. Hybridization, and the strength of the association between the nucleic acids, is impacted by such factors as the degree of complementarity between the hybridizing nucleic acids, the stringency of the conditions involved, the T_(m) of the formed hybrid, and the G:C ratio within the nucleic acids.
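Two of the factors named above, G:C content and T_(m), can be estimated from sequence alone. The Python sketch below is illustrative only and is not a method of the invention. It uses the Wallace rule (2° C. for each A or T and 4° C. for each G or C), a common rule of thumb for short oligonucleotides of roughly 14 to 20 nucleotides that is introduced here as an assumption; it ignores salt concentration, mismatches and strand concentration. The function names are hypothetical.

```python
# Illustrative sketch only; function names are hypothetical.

def gc_fraction(seq):
    """Fraction of bases in `seq` that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm(seq):
    """Rough melting temperature in degrees C by the Wallace rule.

    Assumption: a short DNA oligonucleotide that is completely
    complementary to its target; this is an estimate, not a measurement.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

oligo = "ATGCATGCATGCATGC"
estimate = wallace_tm(oligo)
```

Because each G:C pair contributes more to the estimate than each A:T pair, oligonucleotides of equal length with a higher G:C content are predicted to form more stable hybrids.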

“Inducible promoter” refers to a regulated promoter that can be turned on in one or more cell types by an external stimulus, such as a chemical, light, hormone, stress, temperature or a pathogen.

An “initiation site” is the region surrounding the position of the first nucleotide that is part of the transcribed sequence, which is defined as position +1. All nucleotide positions of the gene are numbered by reference to the first nucleotide of the transcribed sequence, which resides within the initiation site. Downstream sequences (i.e., sequences in the 3′ direction) are denominated positive, while upstream sequences (i.e., sequences in the 5′ direction) are denominated negative.
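The numbering scheme can be sketched as a small Python function that converts a zero-based sequence index into a position relative to the initiation site. This is an illustration only. The function name is hypothetical, and the sketch assumes the common convention that there is no position 0, so the nucleotide immediately upstream of +1 is numbered -1; the text above does not state that convention explicitly.

```python
# Illustrative sketch only; the function name is hypothetical.

def position_relative_to_start(index, start_index):
    """Convert a zero-based index into initiation-site numbering.

    `start_index` is the zero-based index of the first transcribed
    nucleotide (position +1). Assumption: there is no position 0.
    """
    offset = index - start_index
    if offset >= 0:
        return offset + 1  # downstream (3' direction): +1, +2, ...
    return offset          # upstream (5' direction): -1, -2, ...

# With the first transcribed nucleotide at index 10:
positions = [position_relative_to_start(i, 10) for i in (8, 9, 10, 11)]
```

Here the four indices map to -2, -1, +1 and +2 respectively.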

An “isolated” or “purified” nucleic acid or an “isolated” or “purified” polypeptide is a nucleic acid or polypeptide that, by the hand of man, exists apart from its native environment and is therefore not a product of nature. An isolated nucleic acid or polypeptide may exist in a purified form or may exist in a non-native environment such as, for example, a transgenic host cell.

The term “invader oligonucleotide” refers to an oligonucleotide that contains sequences at its 3′ end that are substantially the same as sequences located at the 5′ end of a probe oligonucleotide. These regions will compete for hybridization to the same segment along a complementary target nucleic acid.

The term “label” refers to any atom or molecule that can be used to provide a detectable (preferably quantifiable) signal, and that can be attached to a nucleic acid or protein. Labels may provide signals detectable by fluorescence, radioactivity, colorimetry, gravimetry, X-ray diffraction or absorption, magnetism, enzymatic activity, and the like.

The term “nucleic acid” refers to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form, composed of monomers (nucleotides) containing a sugar, phosphate and a base that is either a purine or pyrimidine. Unless specifically limited, the term encompasses nucleic acids containing known analogs of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions) and complementary sequences as well as the reference sequence explicitly indicated.

The term “oligonucleotide” as used herein is defined as a molecule comprised of two or more deoxyribonucleotides or ribonucleotides, preferably more than three, and usually more than ten. There is no precise upper limit on the size of an oligonucleotide. However, in general, an oligonucleotide is shorter than about 250 nucleotides, preferably shorter than about 200 nucleotides and more preferably shorter than about 100 nucleotides. The exact size will depend on many factors, which in turn depend on the ultimate function or use of the oligonucleotide. The oligonucleotide may be generated in any manner, including chemical synthesis, DNA replication, reverse transcription, or a combination thereof.

The terms “open reading frame” and “ORF” refer to the amino acid sequence encoded between translation initiation and termination codons of a coding sequence. The terms “initiation codon” and “termination codon” refer to a unit of three adjacent nucleotides (‘codon’) in a coding sequence that specifies initiation and chain termination, respectively, of protein synthesis (mRNA translation).

“Operably linked” means joined as part of the same nucleic acid molecule, so that the function of one is affected by the other. In general, “operably linked” also means that two or more nucleic acids are suitably positioned and oriented so that they can function together. Nucleic acids are often operably linked to permit transcription of a coding region to be initiated from the promoter. For example, a regulatory sequence is said to be “operably linked to” or “associated with” a nucleic acid sequence that codes for an RNA or a polypeptide if the two sequences are situated such that the regulatory sequence affects expression of the coding region (i.e., that the coding sequence or functional RNA is under the transcriptional control of the promoter). Coding regions can be operably linked to regulatory sequences in sense or antisense orientation.

The term “probe oligonucleotide” refers to an oligonucleotide that interacts with a target nucleic acid to form a cleavage structure in the presence or absence of an invader oligonucleotide. When annealed to the target nucleic acid, the probe oligonucleotide and target form a cleavage structure and cleavage occurs within the probe oligonucleotide. The presence of an invader oligonucleotide upstream of the probe oligonucleotide can shift the site of cleavage within the probe oligonucleotide (relative to the site of cleavage in the absence of the invader).

“Promoter” refers to a nucleotide sequence, usually upstream (5′) to a coding region, which controls the expression of the coding region by providing the recognition site for RNA polymerase and other factors required for proper transcription. “Promoter” includes but is not limited to a minimal promoter that is a short DNA sequence comprised of a TATA-box. Hence, a promoter includes other sequences that serve to specify the site of transcription initiation and control or regulate expression, for example, enhancers. Accordingly, an “enhancer” is a segment of DNA that can stimulate promoter activity and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue specificity of a promoter. It is capable of operating in both orientations (normal or flipped), and is capable of functioning even when moved either upstream or downstream from the promoter. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even be comprised of synthetic DNA segments. A promoter may also contain DNA segments that are involved in the binding of protein factors that control the effectiveness of transcription initiation in response to physiological or developmental conditions.

The terms “protein,” “peptide” and “polypeptide” are used interchangeably herein.

“Regulatory sequences” and “regulatory elements” refer to nucleotide sequences that control some aspect of the expression of nucleic acid sequences. Such sequences or elements can be located upstream (5′ non-coding sequences), within, or downstream (3′ non-coding sequences) of a coding sequence. “Regulatory sequences” and “regulatory elements” influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences include enhancers, introns, promoters, polyadenylation signal sequences, splicing signals, termination signals, and translation leader sequences. They include natural and synthetic sequences.

As used herein, the term “selectable marker” refers to a gene that encodes an observable or selectable trait that is expressed and can be detected in an organism having that gene. Selectable markers are often linked to a nucleic acid of interest that may not encode an observable trait, in order to trace or select the presence of the nucleic acid of interest. Any selectable marker known to one of skill in the art can be used with the nucleic acids of the invention. Some selectable markers allow the host to survive under circumstances where, without the marker, the host would otherwise die. Examples of selectable markers include antibiotic resistance, for example, tetracycline or ampicillin resistance.

As used herein, the term “stringency” is used to define the conditions of temperature, ionic strength, and the presence of other compounds such as organic solvents, under which nucleic acid hybridizations are conducted. With “high stringency” conditions, nucleic acid base pairing will occur only between nucleic acids that have a high frequency of complementary base sequences. With “weak” or “low” stringency conditions, the frequency of complementary sequences required for base pairing is usually less, so that nucleic acids with differing sequences can be detected and/or isolated.

The terms “substantially similar” and “substantially homologous” refer to nucleotide and amino acid sequences that represent functional equivalents of the instant inventive sequences. For example, altered nucleotide sequences that simply reflect the degeneracy of the genetic code but nonetheless encode amino acid sequences that are identical to the inventive amino acid sequences are substantially similar to the inventive sequences. In addition, amino acid sequences that are substantially similar to the instant sequences are those wherein overall amino acid identity is sufficient to provide an active, thermally stable nucleic acid polymerase. For example, amino acid sequences that are substantially similar to the sequences of the invention are those wherein the overall amino acid identity is 80% or greater, preferably 90% or greater, such as 91%, 92%, 93%, or 94%, and more preferably 95% or greater, such as 96%, 97%, 98%, 99% relative to the amino acid sequences of the invention.
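The identity thresholds recited above can be applied with a simple calculation, sketched here in Python for illustration only. The sketch assumes that the two amino acid sequences have already been aligned to equal length by some alignment method not specified here, and it counts any gap character as a mismatch; actual identity values depend on the alignment method used. The function names and the short peptide strings are hypothetical and are not taken from the Sequence Listing.

```python
# Illustrative sketch only; names and example peptides are hypothetical.

def percent_identity(aligned_a, aligned_b):
    """Percent of aligned positions carrying the same residue.

    Assumption: inputs are pre-aligned and of equal length.
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and of equal length")
    same = sum(x == y for x, y in zip(aligned_a, aligned_b))
    return 100.0 * same / len(aligned_a)

def similarity_tier(pct):
    """Map a percent identity onto the 80%, 90% and 95% thresholds."""
    if pct >= 95:
        return "more preferred (95% or greater)"
    if pct >= 90:
        return "preferred (90% or greater)"
    if pct >= 80:
        return "substantially similar (80% or greater)"
    return "below 80%"

pct = percent_identity("MKAMLPLFEP", "MKAMLPLFEA")
tier = similarity_tier(pct)
```

In this example nine of ten aligned residues match, which falls in the “preferred” range. Identity alone does not establish that a sequence provides an active, thermally stable polymerase, which is the functional requirement stated above.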

A “terminating agent,” “terminating nucleotide” or “terminator” in relation to DNA synthesis or sequencing refers to compounds capable of specifically terminating a DNA sequencing reaction at a specific base. Such compounds include, but are not limited to, dideoxynucleotides having a 2′, 3′ dideoxy structure (e.g., ddATP, ddCTP, ddGTP and ddTTP).

“Thermostable” means that a nucleic acid polymerase remains active at a temperature greater than about 37° C. Preferably, the nucleic acid polymerases of the invention remain active at a temperature greater than about 42° C. More preferably, the nucleic acid polymerases of the invention remain active at a temperature greater than about 50° C. Even more preferably, the nucleic acid polymerases of the invention remain active after exposure to a temperature greater than about 60° C. Most preferably, the nucleic acid polymerases of the invention remain active despite exposure to a temperature greater than about 70° C.

A “transgene” refers to a gene that has been introduced into the genome by transformation and is stably maintained. Transgenes may include, for example, genes that are either heterologous or homologous to the genes of a particular organism to be transformed. Additionally, transgenes may comprise native genes inserted into a non-native organism, or chimeric genes. The term “endogenous gene” refers to a native gene in its natural location in the genome of an organism. A “foreign” or “exogenous” gene refers to a gene not normally found in the host organism but one that is introduced by gene transfer.

The term “transformation” refers to the transfer of a nucleic acid fragment into the genome of a host cell, resulting in genetically stable inheritance. Host cells containing the transformed nucleic acid fragments are referred to as “transgenic” cells, and organisms comprising transgenic cells are referred to as “transgenic organisms.” Transformation may be accomplished by a variety of means known in the art, including calcium DNA co-precipitation, electroporation, viral infection, and the like.

The “variant” of a reference nucleic acid, protein, polypeptide or peptide is a nucleic acid, protein, polypeptide or peptide, respectively, with a related but different sequence than the respective reference nucleic acid, protein, polypeptide or peptide. The differences between variant and reference nucleic acids, proteins, polypeptides or peptides are silent or conservative differences. A variant nucleic acid differs in nucleotide sequence from a reference nucleic acid, whereas a variant protein, polypeptide or peptide differs in amino acid sequence from the reference protein, polypeptide or peptide, respectively. A variant and reference nucleic acid, protein, polypeptide or peptide may differ in sequence by one or more substitutions, insertions, additions, deletions, fusions and truncations, which may be present in any combination. Differences can be minor (e.g., a difference of one nucleotide or amino acid) or more substantial. However, the structure and function of the variant is not so different from the reference that one of skill in the art would not recognize that the variant and reference are related in structure and/or function. Generally, differences are limited so that the reference and the variant are closely similar overall and, in many regions, identical.

The term “vector” refers to a nucleic acid that can transfer another nucleic acid segment(s) into a cell. A “vector” includes, inter alia, any plasmid, cosmid, phage or nucleic acid in double- or single-stranded, linear or circular form that may or may not be self-transmissible or mobilizable. It can transform prokaryotic or eukaryotic host cells either by integration into the cellular genome or by existing extrachromosomally (e.g., an autonomously replicating plasmid with an origin of replication). Vectors used in bacterial systems often contain an origin of replication that allows the vector to replicate independently of the bacterial chromosome. The term “expression vector” refers to a vector containing an expression cassette.

The term “wild-type” refers to a gene or gene product that has the characteristics of that gene or gene product when isolated from a naturally occurring source. A wild-type gene is the gene form most frequently observed in a population and thus arbitrarily is designated the “normal” or “wild-type” form of the gene. In contrast, the term “variant” or “derivative” refers to a gene or gene product that displays modifications in sequence and/or functional properties (i.e., altered characteristics) when compared to the wild-type gene or gene product. Naturally-occurring derivatives can be isolated. They are identified by the fact that they have altered characteristics when compared to the wild-type gene or gene product.

Polymerase Nucleic Acids

The invention provides isolated nucleic acids encoding Thermus thermophilus nucleic acid polymerases, as well as derivatives, fragments and variant nucleic acids thereof that encode an active, thermally stable nucleic acid polymerase. Thus, one aspect of the invention includes the nucleic acid polymerases encoded by the polynucleotide sequences contained in Thermus thermophilus strain RQ-1 from the German Collection of Microorganisms (DSM catalog number 9247). Another aspect of the invention provides nucleic acid polymerases from Thermus thermophilus strain GK24. While a DNA polymerase from Thermus thermophilus strain GK24 has been cloned (Kwon et al., Mol Cells. 1997 Apr. 30; 7(2):264-71), the nucleic acid polymerases of Thermus thermophilus strain GK24 provided by the invention are distinct. Yet another aspect of the invention provides nucleic acid polymerases from Thermus thermophilus strain 1b21. Accordingly, a nucleic acid encoding any one of amino acid sequences SEQ ID NO:13-24, which are amino acid sequences for wild type and several derivative Thermus thermophilus nucleic acid polymerases, is contemplated by the present invention.

In one embodiment, the invention provides a nucleic acid of SEQ ID NO:1, encoding a nucleic acid polymerase from a wild type Thermus thermophilus, strain GK24. SEQ ID NO:1 is provided below:

1 ATGGAGGCGA TGCTTCCGCT CTTTGAACCC AAAGGCCGGG TCCTCCTGGT 51 GGACGGCCACCACCTGGCCT ACCGCACCTT CTTCGCCCTG AAGGGCCTCA 101 CCACGAGCCG GGGCGAACCGGTGCAGGCGG TCTACGGCTT CGCCAAGAGC 151 CTCCTCAAGG CCCTGAAGGA GGACGGGTACAAGGCCGTCT TCGTGGTCTT 201 TGACGCCAAG GCCCCCTCCT TCCGCCACGA GGCCTACGAGGCCTACAAGG 251 CGGGGAGGGC CCCGACCCCC GAGGACTTCC CCCGGCAGCT CGCCCTCATC301 AAGGAGCTGG TGGACCTCCT GGGGTTTACC CGCCTCGAGG TCCCCGGCTA 351CGAGGCGGAC GACGTCCTCG CCACCCTGGC CAAGAAGGCG GAAAAGGAGG 401 GGTACGAGGTGCGCATCCTC ACCGCCGACC GCGACCTCTA CCAACTCGTC 451 TCCGACCGCG TCGCCGTCCTCCACCCCGAG GGCCACCTCA TCACCCCGGA 501 GTGGCTTTGG CAGAAGTACG GCCTCAAGCCGGAGCAGTGG GTGGACTTCC 551 GCGCCCTCGT GGGGGACCCC TCCGACAACC TCCCCGGGGTCAAGGGCATC 601 GGGGAGAAGA CCGCCCTCAA GCTCCTCAAG GAGTGGGGAA GCCTGGAAAA651 CCTCCTCAAG AACCTGGACC GGGTAAAGCC AGAAAACGTC CGGGAGAAGA 701TCAAGGCCCA CCTGGAAGAC CTCAGGCTTT CCTTGGAGCT CTCCCGGGTG 751 CGCACCGACCTCCCCCTGGA GGTGGACCTC GCCCAGGGGC GGGAGCCCGA 801 CCGGGAGGGG CTTAGGGCCTTCCTGGAGAG GCTGGAGTTC GGCAGCCTCC 851 TCCACGAGTT CGGCCTCCTG GAGGCCCCCGCCCCCCTGGA GGAGGCCCCC 901 TGGCCCCCGC CGGAAGGGGC CTTCGTGGGC TTCGTCCTCTCCCGCCCCGA 951 GCCCATGTGG GCGGAGCTTA AAGCCCTGGC CGCCTGCAGG GACGGCCGGG1001 TGCACCGGGC AGCGGACCCC TTGGCGGGGC TAAAGGACCT CAAGGAGGTC 1051CGGGGCCTCC TCGCCAAGGA CCTCGCCGTC TTGGCCTCGA GGGAGGGGCT 1101 AGACCTCGTGCCCGGGGACG ACCCCATGCT CCTCGCCTAC CTCCTGGACC 1151 CCTCCAACAC CACCCCCGAGGGGGTGGCGC GGCGCTACGG GGGGGAGTGG 1201 ACGGAGGACG CCGCCCACCG GGCCCTCCTCTCGGAGAGGC TCCATCGGAA 1251 CCTCCTTAAG CGCCTCCAGG GGGAGGAGAA GCTCCTTTGGCTCTACCACG 1301 AGGTGGAAAA GCCCCTCTCC CGGGTCCTGG CCCACATGGA GGCCACCGGG1351 GTACGGCTGG ACGTGGCCTA CCTGCAGGCC CTTTCCCTGG AGCTTGCGGA 1401GGAGATCCGC CGCCTCGAGG AGGAGGTCTT CCGCTTGGCG GGCCACCCCT 1451 TCAACCTCAACTCCCGGGAC CAGCTGGAGA GGGTGCTCTT TGACGAGCTT 1501 AGGCTTCCCG CCTTGGGGAAGACGCAAAAG ACGGGCAAGC GCTCCACCAG 1551 CGCCGCGGTG CTGGAGGCCC TACGGGAGGCCCACCCCATC GTGGAGAAGA 1601 TCCTCCAGCA CCGGGAGCTC ACCAAGCTCA AGAACACCTACGTGGACCCC 1651 CTCCCAAGCC TCGTCCACCC GAATACGGGC CGCCTCCACA CCCGCTTCAA1701 
CCAGACGGCC ACGGCCACGG GGAGGCTTAG TAGCTCCGAC CCCAACCTGC 1751AGAACATCCC CGTCCGCACC CCCTTGGGCC AGAGGATCCG CCGGGCCTTC 1801 GTGGCCGAGGCGGGTTGGGC GTTGGTGGCC CTGGACTATA GCCAGATAGA 1851 GCTCCGCGTC CTCGCCCACCTCTCCGGGGA CGAGAACCTG ATCAGGGTCT 1901 TCCAGGAGGG GAAGGACATC CACACCCAGACCGCAAGCTG GATGTTCGGC 1951 GTCCCCCCGG AGGCCGTGGA TCCCCTGATG CGCCGGGCGGCCAAGACGGT 2001 GAACTTCGGC GTCCTCTACG GCATGTCCGC CCATAGGCTC TCCCAGGAGC2051 TTGCCATCCC CTACGAGGAG GCGGTGGCCT TTATAGAGCG CTACTTCCAA 2101AGCTTCCCCA AGGTGCGGGC CTGGATAGAA AAGACCCTGG AGGAGGGGAG 2151 GAAGCGGGGCTACGTGGAAA CCCTCTTCGG AAGAAGGCGC TACGTGCCCG 2201 ACCTCAACGC CCGGGTGAAGAGCGTCAGGG AGGCCGCGGA GCGCATGGCC 2251 TTCAACATGC CCGTCCAGGG CACCGCCGCCGACCTCATGA AGCTCGCCAT 2301 GGTGAAGCTC TTCCCCCGCC TCCGGGAGAT GGGGGCCCGCATGCTCCTCC 2351 AGGTCCACGA CGAGCTCCTC CTGGAGGCCC CCCAAGCGCG GGCCGAGGAG2401 GTGGCGGCTT TGGCCAAGGA GGCCATGGAG AAGGCCTATC CCCTCGCCGT 2451GCCCCTGGAG GTGGAGGTGG GGATGGGGGA GGACTGGCTT TCCGCCAAGG 2501 GTTAG

In another embodiment, the invention provides nucleic acids encoding a wild type nucleic acid polymerase from Thermus thermophilus, strain RQ-1, having, for example, SEQ ID NO:2.

1 ATGGAGGCGA TGCTTCCGCT CTTTGAACCC AAAGGCCGGG TCCTCCTGGT 51 GGACGGCCACCACCTGGCCT ACCGCACCTT CTTCGCCCTG AAGGGCCTCA 101 CCACGAGCCG GGGCGAACCGGTGCAGGCGG TCTACGGCTT CGCCAAGAGC 151 CTCCTCAAGG CCCTGAAGGA GGACGGGTACAAGGCCGTCT TCGTGGTCTT 201 TGACGCCAAG GCCCCCTCCT TCCGCCACGA GGCCTACGAGGCCTACAAGG 251 CGGGGAGGGC CCCGACCCCC GAGGACTTCC CCCGGCAGCT CGCCCTCATC301 AAGGAGCTGG TGGACCTCTT GGGGTTTACT CGCCTCGAGG TCCCGGGCTT 351TGAGGCGGAC GACGTCCTCG CCACCCTGGC CAAGAAGGCG GAAAAAGAAG 401 GGTACGAGGTGCGCATCCTC ACCGCCGACC GGGACCTCTA CCAGCTCGTC 451 TCCGACCGGG TCGCCGTCCTCCACCCCGAG GGCCACCTCA TCACCCCGGA 501 GTGGCTTTGG GAGAAGTACG GCCTCAGGCCGGAGCAGTGG GTGGACTTCC 551 GCGCCCTCGT AGGGGACCCC TCCGACAACC TCCCCGGGGTCAAGGGCATC 601 GGGGAGAAGA CCGCCCTCAA GCTCCTTAAG GAGTGGGGAA GCCTGGAAAA651 CCTCCTCAAG AACCTGGACC GGGTGAAGCC GGAAAGCGTC CGGGAGAAGA 701TCAAGGCCCA CCTGGAAGAC CTCAGGCTCT CCTTGGAGCT CTCCCGGGTG 751 CGCACCGACCTCCCCCTGGA GGTGGACCTC GCCCAGGGGC GGGAGCCCGA 801 CCGGGAAGGG CTTAGGGCCTTCCTGGAGAG GCTAGAGTTC GGCAGCCTCC 851 TCCACGAGTT CGGCCTCCTG GAGGCCCCCGCCCCCCTGGA GGAGGCCCCC 901 TGGCCCCCGC CGGAAGGGGC CTTCGTGGGC TTCGTCCTCTCCCGCCCCGA 951 GCCCATGTGG GCGGAGCTTA AAGCCCTGGC CGCCTGCAGG GACGGCCGGG1001 TGCACCGGGC GGAGGACCCC TTGGCGGGGC TTAAGGACCT CAAGGAGGTC 1051CGGGGCCTCC TCGCCAAGGA CCTCGCCGTT TTGGCCTCGA GGGAGGGGCT 1101 AGACCTCGTGCCCGGGGACG ACCCCATGCT CCTCGCCTAC CTCCTGGACC 1151 CCTCCAACAC CACCCCCGAGGGGGTGGCGC GGCGCTACGG GGGGGAGTGG 1201 ACGGAGGACG CCGCCCAGCG GGCCCTCCTCTCGGAGAGGC TCCAGCAGAA 1251 CCTCCTTAAG CGCCTCCAGG GGGAGGAGAA GCTCCTCTGGCTCTACCACG 1301 AGGTGGAAAA GCCCCTCTCC CGGGTCCTGG CCCACATGGA GGCCACCGGG1351 GTACGGCTGG ACGTGGCCTA CCTTCAGGCC CTTTCCCTGG AGCTTGCGGA 1401GGAGATCCGC CGCCTCGAGG AGGAGGTCTT CCGCTTGGCG GGCCACCCCT 1451 TCAACCTCAACTCCCGGGAC CAGCTGGAAA GGGTGCTCTT TGACGAGCTT 1501 AGGCTTCCCG CCTTGGGGAAGACGCAAAAG ACGGGCAAGC GCTCCACCAG 1551 CGCCGCGGTG CTGGAGGCCC TACGGGAGGCCCACCCCATC GTGGAGAAGA 1601 TCCTCCAGCA CCGGGAGCTC ACCAAGCTCA AGAACACCTACGTGGACCCC 1651 CTCCCAAGCC TCGTCCACCC GAGGACGGGC CGCCTCCACA CCCGCTTCAA1701 
CCAGACGGCC ACGGCCACGG GGAGGCTTAG TAGCTCCGAC CCCAACCTGC 1751AGAACATCCC CGTCCGCACC CCCTTGGGCC AGAGGATCCG CCGGGCCTTC 1801 GTAGCCGAGGCGGGATGGGC GTTGGTGGCC CTGGACTATA GCCAGATAGA 1851 GCTCCGCGTC CTCGCCCACCTCTCCGGGGA CGAGAACCTG ATCAGGGTCT 1901 TCCAGGAGGG GAAGGACATC CACACCCAGACCGCAAGCTG GATGTTCGGT 1951 GTCCCCCCGG AGGCCGTGGA CCCCCTGATG CGCCGGGCGGCCAAGACGGT 2001 GAACTTCGGC GTCCTCTACG GCATGTCCGC CCACCGGCTC TCCCAGGAGC2051 TTTCCATCCC CTACGAGGAG GCGGTGGCCT TTATAGAGCG CTACTTCCAA 2101AGCTTCCCCA AGGTGCGGGC CTGGATAGAA AAGACCCTGG AGGAGGGGAG 2151 GAAGCGGGGCTACGTGGAAA CCCTCTTCGG AAGAAGGCGC TACGTGCCCG 2201 ACCTCAACGC CCGGGTGAAGAGCGTCAGGG AGGCCGCGGA GCGCATGGCC 2251 TTCAACATGC CCGTCCAGGG CACCGCCGCCGACCTCATGA AGCTCGCCAT 2301 GGTGAAGCTC TTCCCCCGCC TCCGGGAGAT GGGGGCCCGCATGCTCCTCC 2351 AGGTCCACGA CGAGCTCCTC CTGGAGGCCC CCCAAGCGCG GGCCGAGGAG2401 GTGGCGGCTT TGGCCAAGGA GGCCATGGAG AAGGCCTATC CCCTCGCCGT 2451ACCCCTGGAG GTGGAGGTGG GGATCGGGGA GGACTGGCTT TCCGCCAAGG 2501 GCTAG

In another embodiment, the invention provides nucleic acids encoding a wild type nucleic acid polymerase from Thermus thermophilus, strain 1b21, having, for example, SEQ ID NO:3.

1 ATGGAGGCGA TGCTTCCGCT CTTTGAACCC AAAGGCCGGG TCCTCCTGGT 51 GGACGGCCACCACCTGGCCT ACCGCACCTT CTTCGCCCTG AAGGGCCTCA 101 CCACGAGCCG GGGCGAACCGGTGCAGGCGG TCTACGGCTT CGCCAAGAGC 151 CTCCTCAAGG CCCTGAAGGA GGACGGGTACAAGGCCGTCT TCGTGGTCTT 201 TGACGCCAAG GCCCCCTCCT TCCGCCACGA GGCCTACGAGGCCTACAAGG 251 CGGGGAGGGC CCCGACCCCC GAGGACTTCC CCCGGCAGCT CGCCCTCATC301 AAGGAGCTGG TGGACCTCCT GGGGTTTACC CGCCTCGAGG TCCCCGGCTA 351CGAGGCGGAC GACGTCCTCG CCACCCTGGC CAAGAAGGCG GAAAAGGAGG 401 GGTACGAGGTGCGCATCCTC ACCGCCGACC GCGACCTCTA CCAACTCGTC 451 TCCGACCGCG TCGCCGTCCTCCACCCCGAG GGCCACCTCA TCACCCCGGA 501 GTGGCTTTGG GAGAAGTACG GCCTCAAGCCGGAGCAGTGG GTGGACTTCC 551 GCGCCCTCGT GGGGGACCCC TCCGACAACC TCCCCGGGGTCAAGGGCATC 601 GGGGAGAAGA CCGCCCTCAA GCTCCTCAAG GAGTGGGGAA GCCTGGAAAA651 CCTCCTCAAG AACCTGGACC GGGTAAAGCC AGAAAACGTC CGGGAGAAGA 701TCAAGGCCCA CCTGGAAGAC CTCAGGCTTT CCTTGGAGCT CTCCCGGGTG 751 CGCACCGACCTCCCCCTGGA GGTGGACCTC GCCCAGGGGC GGGAGCCCGA 801 CCGGGAGGGG CTTAGGGCCTTCCTGGAGAG GCTGGAGTTC GGCAGCCTCC 851 TCCACGAGTT CGGCCTCCTG GAGGCCCCCGCCCCCCTGGA GGAGGCCCCC 901 TGGCCCCCGC CGGAAGGGGC CTTCGTGGGC TTCGTCCTCTCCCGCCCCGA 951 GCCCATGTGG GCGGAGCTTA AAGCCCTGGC CGCCTGCAGG GACGGCCGGG1001 TGCACCGGGC AGCAGACCCC TTGGCGGGGC TAAAGGACCT CAAGGAGGTC 1051CGGGGCCTCC TCGCCAAGGA CCTCGCCGTC TTGGCCTCGA GGGAGGGGCT 1101 AGACCTCGTGCCCGGGGACG ACCCCATGCT CCTCGCCTAC CTCCTGGACC 1151 CCTCCAACAC CACCCCCGAGGGGGTGGCGC GGCGCTACGG GGGGGAGTGG 1201 ACGGAGGACG CCGCCCACCG GGCCCTCCTCTCGGAGAGGC TCCATCGGAA 1251 CCTCCTTAAG CGCCTCGAGG GGGAGGAGAA GCTCCTTTGGCTCTACCACG 1301 AGGTGGAAAA GCCCCTCTCC CGGGTCCTGG CCCACATGGA GGCCACCGGG1351 GTACGGCTGG ACGTGGCCTA CCTTCAGGCC CTTTCCCTGG AGCTTGCGGA 1401GGAGATCCGC CGCCTCGAGG AGGAGGTCTT CCGCTTGGCG GGCCACCCCT 1451 TCAACCTCAACTCCCGGGAC CAGCTGGAAA GGGTGCTCTT TGACGAGCTT 1501 AGGCTTCCCG CCTTGGGGAAGACGCAAAAG ACGGGCAAGC GCTCCACCAG 1551 CGCCGCGGTG CTGGAGGCCC TACGGGAGGCCCACCCCATC GTGGAGAAGA 1601 TCCTCCAGCA CCGGGAGCTC ACCAAGCTCA AGAACACCTACGTGGACCCC 1651 CTCCCAAGCC TCGTCCACCC GAGGACGGGC CGCCTCCACA CCCGCTTCAA1701 
CCAGACGGCC ACGGCCACGG GGAGGCTTAG TAGCTCCGAC CCCAACCTGC 1751AGAACATCCC CGTCCGCACC CCCTTGGGCC AGAGGATCCG CCGGGCCTTC 1801 GTGGCCGAGGCGGGATGGGC GTTGGTGGCC CTGGACTATA GCCAGATAGA 1851 GCTCCGCGTC CTCGCCCACCTCTCCGGGGA CGAGAACCTG ATCAGGGTCT 1901 TCCAGGAGGG GAAGGACATC CACACCCAGACCGCAAGCTG GATGTTCGGC 1951 GTCCCCCCGG AGGCCGTGGA CCCCCTGATG CGCCGGGCGGCCAAGACGGT 2001 GAACTTCGGC GTCCTCTACG GCATGTCCGC CCATAGGCTC TCCCAGGAGC2051 TTGCCATCCC CTACGAGGAG GCGGTGGCCT TTATAGAGCG CTACTTCCAA 2101AGCTTCCCCA AGGTGCGGGC CTGGATAGAA AAGACCCTGG AGGAGGGGAG 2151 AAAGCGGGGCTACGTGGAAA CCCTCTTCGG AAGAAGGCGC TACGTGCCCG 2201 ACCTCAACGC CCGGGTGAAGAGCGTCAGGG AGGCCGCGGA GCGCATGGCC 2251 TTCAACATGC CCGTCCAGGG CACCGCCGCCGACCTCATGA AGCTCGCCAT 2301 GGTGAAGCTC TTCCCCCGCC TCCGGGAGAT GGGGGCCCGCATGCTCCTCC 2351 AGGTCCACGA CGAGCTCCTC CTGGAGGCCC CCCAAGCGCG GGCCGAGGAG2401 GTGGCGGCTT TGGCCAAGGA GGCCATGGAG AAGGCCTATC CCCTCGCCGT 2451GCCCCTGGAG GTGGAGGTGG GGATGGGGGA GGACTGGCTT TCCGCCAAGG 2501 GTTAG

In another embodiment, the invention provides a nucleic acid of SEQ ID NO:4, a derivative nucleic acid related to Thermus thermophilus, strain GK24, having GAC (encoding Asp) in place of GGC (encoding Gly) at positions 136-138. SEQ ID NO:4 is provided below:

1 ATGGAGGCGA TGCTTCCGCT CTTTGAACCC AAAGGCCGGG TCCTCCTGGT 51 GGACGGCCACCACCTGGCCT ACCGCACCTT CTTCGCCCTG AAGGGCCTCA 101 CCACGAGCCG GGGCGAACCGGTGCAGGCGG TCTAC GAC TT CGCCAAGAGC 151 CTCCTCAAGG CCCTGAAGGA GGACGGGTACAAGGCCGTCT TCGTGGTCTT 201 TGACGCCAAG GCCCCCTCCT TCCGCCACGA GGCCTACGAGGCCTACAAGG 251 CGGGGAGGGC CCCGACCCCC GAGGACTTCC CCCGGCAGCT CGCCCTCATC301 AAGGAGCTGG TGGACCTCCT GGGGTTTACC CGCCTCGAGG TCCCCGGCTA 351CGAGGCGGAC GACGTCCTCG CCACCCTGGC CAAGAAGGCG GAAAAGGAGG 401 GGTACGAGGTGCGCATCCTC ACCGCCGACC GCGACCTCTA CCAACTCGTC 451 TCCGACCGCG TCGCCGTCCTCCACCCCGAG GGCCACCTCA TCACCCCGGA 501 GTGGCTTTGG CAGAAGTACG GCCTCAAGCCGGAGCAGTGG GTGGACTTCC 551 GCGCCCTCGT GGGGGACCCC TCCGACAACC TCCCCGGGGTCAAGGGCATC 601 GGGGAGAAGA CCGCCCTCAA GCTCCTCAAG GAGTGGGGAA GCCTGGAAAA651 CCTCCTCAAG AACCTGGACC GGGTAAAGCC AGAAAACGTC CGGGAGAAGA 701TCAAGGCCCA CCTGGAAGAC CTCAGGCTTT CCTTGGAGCT CTCCCGGGTG 751 CGCACCGACCTCCCCCTGGA GGTGGACCTC GCCCAGGGGC GGGAGCCCGA 801 CCGGGAGGGG CTTAGGGCCTTCCTGGAGAG GCTGGAGTTC GGCAGCCTCC 851 TCCACGAGTT CGGCCTCCTG GAGGCCCCCGCCCCCCTGGA GGAGGCCCCC 901 TGGCCCCCGC CGGAAGGGGC CTTCGTGGGC TTCGTCCTCTCCCGCCCCGA 951 GCCCATGTGG GCGGAGCTTA AAGCCCTGGC CGCCTGCAGG GACGGCCGGG1001 TGCACCGGGC AGCGGACCCC TTGGCGGGGC TAAAGGACCT CAAGGAGGTC 1051CGGGGCCTCC TCGCCAAGGA CCTCGCCGTC TTGGCCTCGA GGGAGGGGCT 1101 AGACCTCGTGCCCGGGGACG ACCCCATGCT CCTCGCCTAC CTCCTGGACC 1151 CCTCCAACAC CACCCCCGAGGGGGTGGCGC GGCGCTACGG GGGGGAGTGG 1201 ACGGAGGACG CCGCCCACCG GGCCCTCCTCTCGGAGAGGC TCCATCGGAA 1251 CCTCCTTAAG CGCCTCCAGG GGGAGGAGAA GCTCCTTTGGCTCTACCACG 1301 AGGTGGAAAA GCCCCTCTCC CGGGTCCTGG CCCACATGGA GGCCACCGGG1351 GTACGGCTGG ACGTGGCCTA CCTGCAGGCC CTTTCCCTGG AGCTTGCGGA 1401GGAGATCCGC CGCCTCGAGG AGGAGGTCTT CCGCTTGGCG GGCCACCCCT 1451 TCAACCTCAACTCCCGGGAC CAGCTGGAGA GGGTGCTCTT TGACGAGCTT 1501 AGGCTTCCCG CCTTGGGGAAGACGCAAAAG ACGGGCAAGC GCTCCACCAG 1551 CGCCGCGGTG CTGGAGGCCC TACGGGAGGCCCACCCCATC GTGGAGAAGA 1601 TCCTCCAGCA CCGGGAGCTC ACCAAGCTCA AGAACACCTACGTGGACCCC 1651 CTCCCAAGCC TCGTCCACCC GAATACGGGC CGCCTCCACA CCCGCTTCAA1701 
CCAGACGGCC ACGGCCACGG GGAGGCTTAG TAGCTCCGAC CCCAACCTGC 1751AGAACATCCC CGTCCGCACC CCCTTGGGCC AGAGGATCCG CCGGGCCTTC 1801 GTGGCCGAGGCGGGTTGGGC GTTGGTGGCC CTGGACTATA GCCAGATAGA 1851 GCTCCGCGTC CTCGCCCACCTCTCCGGGGA CGAGAACCTG ATCAGGGTCT 1901 TCCAGGAGGG GAAGGACATC CACACCCAGACCGCAAGCTG GATGTTCGGC 1951 GTCCCCCCGG AGGCCGTGGA TCCCCTGATG CGCCGGGCGGCCAAGACGGT 2001 GAACTTCGGC GTCCTCTACG GCATGTCCGC CCATAGGCTC TCCCAGGAGC2051 TTGCCATCCC CTACGAGGAG GCGGTGGCCT TTATAGAGCG CTACTTCCAA 2101AGCTTCCCCA AGGTGCGGGC CTGGATAGAA AAGACCCTGG AGGAGGGGAG 2151 GAAGCGGGGCTACGTGGAAA CCCTCTTCGG AAGAAGGCGC TACGTGCCCG 2201 ACCTCAACGC CCGGGTGAAGAGCGTCAGGG AGGCCGCGGA GCGCATGGCC 2251 TTCAACATGC CCGTCCAGGG CACCGCCGCCGACCTCATGA AGCTCGCCAT 2301 GGTGAAGCTC TTCCCCCGCC TCCGGGAGAT GGGGGCCCGCATGCTCCTCC 2351 AGGTCCACGA CGAGCTCCTC CTGGAGGCCC CCCAAGCGCG GGCCGAGGAG2401 GTGGCGGCTT TGGCCAAGGA GGCCATGGAG AAGGCCTATC CCCTCGCCGT 2451GCCCCTGGAG GTGGAGGTGG GGATGGGGGA GGACTGGCTT TCCGCCAAGG 2501 GTTAG
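The codon change that distinguishes SEQ ID NO:4 from SEQ ID NO:1 can be sketched as follows, using only positions 121-150 of SEQ ID NO:1 as given above. This is a minimal illustration rather than a cloning or mutagenesis protocol; the function names and the fragment-based coordinate handling are assumptions of the example, and the two-entry codon table is limited to the codons named in the text.

```python
# Positions 121-150 of SEQ ID NO:1 (wild type, strain GK24), spaces removed.
FRAGMENT_START = 121
WILD_TYPE_FRAGMENT = "GTGCAGGCGGTCTACGGCTTCGCCAAGAGC"

# Only the two codons discussed in the text; not a full genetic code table.
CODON_TO_RESIDUE = {"GGC": "Gly", "GAC": "Asp"}


def codon_number(first_position: int) -> int:
    """1-based codon index for a codon whose first base is at first_position."""
    if (first_position - 1) % 3 != 0:
        raise ValueError("position is not the first base of a codon")
    return (first_position - 1) // 3 + 1


def substitute(fragment: str, fragment_start: int,
               first: int, last: int, replacement: str) -> str:
    """Replace 1-based positions first..last (full-sequence numbering)."""
    lo = first - fragment_start
    hi = last - fragment_start + 1
    if lo < 0 or hi > len(fragment):
        raise ValueError("positions fall outside the fragment")
    if len(replacement) != hi - lo:
        raise ValueError("replacement must have the same length as the span")
    return fragment[:lo] + replacement + fragment[hi:]


if __name__ == "__main__":
    lo = 136 - FRAGMENT_START
    original_codon = WILD_TYPE_FRAGMENT[lo:lo + 3]
    derivative = substitute(WILD_TYPE_FRAGMENT, FRAGMENT_START, 136, 138, "GAC")
    print("codon", codon_number(136), ":",
          CODON_TO_RESIDUE[original_codon], "->", CODON_TO_RESIDUE["GAC"])
    print(derivative)
```

Because nucleotide 136 is the first base of a codon in the reading frame that starts at the ATG at position 1, the substitution affects a single codon.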

In another embodiment, the invention provides a nucleic acid of SEQ ID NO:5, a derivative nucleic acid related to Thermus thermophilus, strain RQ-1, having GAC (encoding Asp) in place of GGC (encoding Gly) at positions 136-138. SEQ ID NO:5 is provided below:

1 ATGGAGGCGA TGCTTCCGCT CTTTGAACCC AAAGGCCGGG TCCTCCTGGT 51 GGACGGCCACCACCTGGCCT ACCGCACCTT CTTCGCCCTG AAGGGCCTCA 101 CCACGAGCCG GGGCGAACCGGTGCAGGCGG TCTAC GAC TT CGCCAAGAGC 151 CTCCTCAAGG CCCTGAAGGA GGACGGGTACAAGGCCGTCT TCGTGGTCTT 201 TGACGCCAAG GCCCCCTCCT TCCGCCACGA GGCCTACGAGGCCTACAAGG 251 CGGGGAGGGC CCCGACCCCC GAGGACTTCC CCCGGCAGCT CGCCCTCATC301 AAGGAGCTGG TGGACCTCTT GGGGTTTACT CGCCTCGAGG TCCCGGGCTT 351TGAGGCGGAC GACGTCCTCG CCACCCTGGC CAAGAAGGCG GAAAAAGAAG 401 GGTACGAGGTGCGCATCCTC ACCGCCGACC GGGACCTCTA CCAGCTCGTC 451 TCCGACCGGG TCGCCGTCCTCCACCCCGAG GGCCACCTCA TCACCCCGGA 501 GTGGCTTTGG GAGAAGTACG GCCTCAGGCCGGAGCAGTGG GTGGACTTCC 551 GCGCCCTCGT AGGGGACCCC TCCGACAACC TCCCCGGGGTCAAGGGCATC 601 GGGGAGAAGA CCGCCCTCAA GCTCCTTAAG GAGTGGGGAA GCCTGGAAAA651 CCTCCTCAAG AACCTGGACC GGGTGAAGCC GGAAAGCGTC CGGGAGAAGA 701TCAAGGCCCA CCTGGAAGAC CTCAGGCTCT CCTTGGAGCT CTCCCGGGTG 751 CGCACCGACCTCCCCCTGGA GGTGGACCTC GCCCAGGGGC GGGAGCCCGA 801 CCGGGAAGGG CTTAGGGCCTTCCTGGAGAG GCTAGAGTTC GGCAGCCTCC 851 TCCACGAGTT CGGCCTCCTG GAGGCCCCCGCCCCCCTGGA GGAGGCCCCC 901 TGGCCCCCGC CGGAAGGGGC CTTCGTGGGC TTCGTCCTCTCCCGCCCCGA 951 GCCCATGTGG GCGGAGCTTA AAGCCCTGGC CGCCTGCAGG GACGGCCGGG1001 TGCACCGGGC GGAGGACCCC TTGGCGGGGC TTAAGGACCT CAAGGAGGTC 1051CGGGGCCTCC TCGCCAAGGA CCTCGCCGTT TTGGCCTCGA GGGAGGGGCT 1101 AGACCTCGTGCCCGGGGACG ACCCCATGCT CCTCGCCTAC CTCCTGGACC 1151 CCTCCAACAC CACCCCCGAGGGGGTGGCGC GGCGCTACGG GGGGGAGTGG 1201 ACGGAGGACG CCGCCCAGCG GGCCCTCCTCTCGGAGAGGC TCCAGCAGAA 1251 CCTCCTTAAG CGCCTCCAGG GGGAGGAGAA GCTCCTCTGGCTCTACCACG 1301 AGGTGGAAAA GCCCCTCTCC CGGGTCCTGG CCCACATGGA GGCCACCGGG1351 GTACGGCTGG ACGTGGCCTA CCTTCAGGCC CTTTCCCTGG AGCTTGCGGA 1401GGAGATCCGC CGCCTCGAGG AGGAGGTCTT CCGCTTGGCG GGCCACCCCT 1451 TCAACCTCAACTCCCGGGAC CAGCTGGAAA GGGTGCTCTT TGACGAGCTT 1501 AGGCTTCCCG CCTTGGGGAAGACGCAAAAG ACGGGCAAGC GCTCCACCAG 1551 CGCCGCGGTG CTGGAGGCCC TACGGGAGGCCCACCCCATC GTGGAGAAGA 1601 TCCTCCAGCA CCGGGAGCTC ACCAAGCTCA AGAACACCTACGTGGACCCC 1651 CTCCCAAGCC TCGTCCACCC GAGGACGGGC CGCCTCCACA CCCGCTTCAA1701 
CCAGACGGCC ACGGCCACGG GGAGGCTTAG TAGCTCCGAC CCCAACCTGC 1751AGAACATCCC CGTCCGCACC CCCTTGGGCC AGAGGATCCG CCGGGCCTTC 1801 GTAGCCGAGGCGGGATGGGC GTTGGTGGCC CTGGACTATA GCCAGATAGA 1851 GCTCCGCGTC CTCGCCCACCTCTCCGGGGA CGAGAACCTG ATCAGGGTCT 1901 TCCAGGAGGG GAAGGACATC CACACCCAGACCGCAAGCTG GATGTTCGGT 1951 GTCCCCCCGG AGGCCGTGGA CCCCCTGATG CGCCGGGCGGCCAAGACGGT 2001 GAACTTCGGC GTCCTCTACG GCATGTCCGC CCACCGGCTC TCCCAGGAGC2051 TTTCCATCCC CTACGAGGAG GCGGTGGCCT TTATAGAGCG CTACTTCCAA 2101AGCTTCCCCA AGGTGCGGGC CTGGATAGAA AAGACCCTGG AGGAGGGGAG 2151 GAAGCGGGGCTACGTGGAAA CCCTCTTCGG AAGAAGGCGC TACGTGCCCG 2201 ACCTCAACGC CCGGGTGAAGAGCGTCAGGG AGGCCGCGGA GCGCATGGCC 2251 TTCAACATGC CCGTCCAGGG CACCGCCGCCGACCTCATGA AGCTCGCCAT 2301 GGTGAAGCTC TTCCCCCGCC TCCGGGAGAT GGGGGCCCGCATGCTCCTCC 2351 AGGTCCACGA CGAGCTCCTC CTGGAGGCCC CCCAAGCGCG GGCCGAGGAG2401 GTGGCGGCTT TGGCCAAGGA GGCCATGGAG AAGGCCTATC CCCTCGCCGT 2451ACCCCTGGAG GTGGAGGTGG GGATCGGGGA GGACTGGCTT TCCGCCAAGG 2501 GCTAG

In another embodiment, the invention provides a nucleic acid of SEQ ID NO:6, a derivative nucleic acid related to Thermus thermophilus, strain 1b21, having GAC (encoding Asp) in place of GGC (encoding Gly) at positions 136-138. SEQ ID NO:6 is provided below:

1 ATGGAGGCGA TGCTTCCGCT CTTTGAACCC AAAGGCCGGG TCCTCCTGGT 51 GGACGGCCACCACCTGGCCT ACCGCACCTT CTTCGCCCTG AAGGGCCTCA 101 CCACGAGCCG GGGCGAACCGGTGCAGGCGG TCTAC GAC TT CGCCAAGAGC 151 CTCCTCAAGG CCCTGAAGGA GGACGGGTACAAGGCCGTCT TCGTGGTCTT 201 TGACGCCAAG GCCCCCTCCT TCCGCCACGA GGCCTACGAGGCCTACAAGG 251 CGGGGAGGGC CCCGACCCCC GAGGACTTCC CCCGGCAGCT CGCCCTCATC301 AAGGAGCTGG TGGACCTCCT GGGGTTTACC CGCCTCGAGG TCCCCGGCTA 351CGAGGCGGAC GACGTCCTCG CCACCCTGGC CAAGAAGGCG GAAAAGGAGG 401 GGTACGAGGTGCGCATCCTC ACCGCCGACC GCGACCTCTA CCAACTCGTC 451 TCCGACCGCG TCGCCGTCCTCCACCCCGAG GGCCACCTCA TCACCCCGGA 501 GTGGCTTTGG GAGAAGTACG GCCTCAAGCCGGAGCAGTGG GTGGACTTCC 551 GCGCCCTCGT GGGGGACCCC TCCGACAACC TCCCCGGGGTCAAGGGCATC 601 GGGGAGAAGA CCGCCCTCAA GCTCCTCAAG GAGTGGGGAA GCCTGGAAAA651 CCTCCTCAAG AACCTGGACC GGGTAAAGCC AGAAAACGTC CGGGAGAAGA 701TCAAGGCCCA CCTGGAAGAC CTCAGGCTTT CCTTGGAGCT CTCCCGGGTG 751 CGCACCGACCTCCCCCTGGA GGTGGACCTC GCCCAGGGGC GGGAGCCCGA 801 CCGGGAGGGG CTTAGGGCCTTCCTGGAGAG GCTGGAGTTC GGCAGCCTCC 851 TCCACGAGTT CGGCCTCCTG GAGGCCCCCGCCCCCCTGGA GGAGGCCCCC 901 TGGCCCCCGC CGGAAGGGGC CTTCGTGGGC TTCGTCCTCTCCCGCCCCGA 951 GCCCATGTGG GCGGAGCTTA AAGCCCTGGC CGCCTGCAGG GACGGCCGGG1001 TGCACCGGGC AGCAGACCCC TTGGCGGGGC TAAAGGACCT CAAGGAGGTC 1051CGGGGCCTCC TCGCCAAGGA CCTCGCCGTC TTGGCCTCGA GGGAGGGGCT 1101 AGACCTCGTGCCCGGGGACG ACCCCATGCT CCTCGCCTAC CTCCTGGACC 1151 CCTCCAACAC CACCCCCGAGGGGGTGGCGC GGCGCTACGG GGGGGAGTGG 1201 ACGGAGGACG CCGCCCACCG GGCCCTCCTCTCGGAGAGGC TCCATCGGAA 1251 CCTCCTTAAG CGCCTCGAGG GGGAGGAGAA GCTCCTTTGGCTCTACCACG 1301 AGGTGGAAAA GCCCCTCTCC CGGGTCCTGG CCCACATGGA GGCCACCGGG1351 GTACGGCTGG ACGTGGCCTA CCTTCAGGCC CTTTCCCTGG AGCTTGCGGA 1401GGAGATCCGC CGCCTCGAGG AGGAGGTCTT CCGCTTGGCG GGCCACCCCT 1451 TCAACCTCAACTCCCGGGAC CAGCTGGAAA GGGTGCTCTT TGACGAGCTT 1501 AGGCTTCCCG CCTTGGGGAAGACGCAAAAG ACGGGCAAGC GCTCCACCAG 1551 CGCCGCGGTG CTGGAGGCCC TACGGGAGGCCCACCCCATC GTGGAGAAGA 1601 TCCTCCAGCA CCGGGAGCTC ACCAAGCTCA AGAACACCTACGTGGACCCC 1651 CTCCCAAGCC TCGTCCACCC GAGGACGGGC CGCCTCCACA CCCGCTTCAA1701 
CCAGACGGCC ACGGCCACGG GGAGGCTTAG TAGCTCCGAC CCCAACCTGC 1751AGAACATCCC CGTCCGCACC CCCTTGGGCC AGAGGATCCG CCGGGCCTTC 1801 GTGGCCGAGGCGGGATGGGC GTTGGTGGCC CTGGACTATA GCCAGATAGA 1851 GCTCCGCGTC CTCGCCCACCTCTCCGGGGA CGAGAACCTG ATCAGGGTCT 1901 TCCAGGAGGG GAAGGACATC CACACCCAGACCGCAAGCTG GATGTTCGGC 1951 GTCCCCCCGG AGGCCGTGGA CCCCCTGATG CGCCGGGCGGCCAAGACGGT 2001 GAACTTCGGC GTCCTCTACG GCATGTCCGC CCATAGGCTC TCCCAGGAGC2051 TTGCCATCCC CTACGAGGAG GCGGTGGCCT TTATAGAGCG CTACTTCCAA 2101AGCTTCCCCA AGGTGCGGGC CTGGATAGAA AAGACCCTGG AGGAGGGGAG 2151 AAAGCGGGGCTACGTGGAAA CCCTCTTCGG AAGAAGGCGC TACGTGCCCG 2201 ACCTCAACGC CCGGGTGAAGAGCGTCAGGG AGGCCGCGGA GCGCATGGCC 2251 TTCAACATGC CCGTCCAGGG CACCGCCGCCGACCTCATGA AGCTCGCCAT 2301 GGTGAAGCTC TTCCCCCGCC TCCGGGAGAT GGGGGCCCGCATGCTCCTCC 2351 AGGTCCACGA CGAGCTCCTC CTGGAGGCCC CCCAAGCGCG GGCCGAGGAG2401 GTGGCGGCTT TGGCCAAGGA GGCCATGGAG AAGGCCTATC CCCTCGCCGT 2451GCCCCTGGAG GTGGAGGTGG GGATGGGGGA GGACTGGCTT TCCGCCAAGG 2501 GTTAG

In another embodiment, the invention provides a nucleic acid of SEQ ID NO:7, a derivative nucleic acid related to Thermus thermophilus, strain GK24, having TAC (encoding Tyr) in place of TTC (encoding Phe) at positions 2005-2007. SEQ ID NO:7 is provided below:

1 ATGGAGGCGA TGCTTCCGCT CTTTGAACCC AAAGGCCGGG TCCTCCTGGT 51 GGACGGCCACCACCTGGCCT ACCGCACCTT CTTCGCCCTG AAGGGCCTCA 101 CCACGAGCCG GGGCGAACCGGTGCAGGCGG TCTACGGCTT CGCCAAGAGC 151 CTCCTCAAGG CCCTGAAGGA GGACGGGTACAAGGCCGTCT TCGTGGTCTT 201 TGACGCCAAG GCCCCCTCCT TCCGCCACGA GGCCTACGAGGCCTACAAGG 251 CGGGGAGGGC CCCGACCCCC GAGGACTTCC CCCGGCAGCT CGCCCTCATC301 AAGGAGCTGG TGGACCTCCT GGGGTTTACC CGCCTCGAGG TCCCCGGCTA 351CGAGGCGGAC GACGTCCTCG CCACCCTGGC CAAGAAGGCG GAAAAGGAGG 401 GGTACGAGGTGCGCATCCTC ACCGCCGACC GCGACCTCTA CCAACTCGTC 451 TCCGACCGCG TCGCCGTCCTCCACCCCGAG GGCCACCTCA TCACCCCGGA 501 GTGGCTTTGG CAGAAGTACG GCCTCAAGCCGGAGCAGTGG GTGGACTTCC 551 GCGCCCTCGT GGGGGACCCC TCCGACAACC TCCCCGGGGTCAAGGGCATC 601 GGGGAGAAGA CCGCCCTCAA GCTCCTCAAG GAGTGGGGAA GCCTGGAAAA651 CCTCCTCAAG AACCTGGACC GGGTAAAGCC AGAAAACGTC CGGGAGAAGA 701TCAAGGCCCA CCTGGAAGAC CTCAGGCTTT CCTTGGAGCT CTCCCGGGTG 751 CGCACCGACCTCCCCCTGGA GGTGGACCTC GCCCAGGGGC GGGAGCCCGA 801 CCGGGAGGGG CTTAGGGCCTTCCTGGAGAG GCTGGAGTTC GGCAGCCTCC 851 TCCACGAGTT CGGCCTCCTG GAGGCCCCCGCCCCCCTGGA GGAGGCCCCC 901 TGGCCCCCGC CGGAAGGGGC CTTCGTGGGC TTCGTCCTCTCCCGCCCCGA 951 GCCCATGTGG GCGGAGCTTA AAGCCCTGGC CGCCTGCAGG GACGGCCGGG1001 TGCACCGGGC AGCGGACCCC TTGGCGGGGC TAAAGGACCT CAAGGAGGTC 1051CGGGGCCTCC TCGCCAAGGA CCTCGCCGTC TTGGCCTCGA GGGAGGGGCT 1101 AGACCTCGTGCCCGGGGACG ACCCCATGCT CCTCGCCTAC CTCCTGGACC 1151 CCTCCAACAC CACCCCCGAGGGGGTGGCGC GGCGCTACGG GGGGGAGTGG 1201 ACGGAGGACG CCGCCCACCG GGCCCTCCTCTCGGAGAGGC TCCATCGGAA 1251 CCTCCTTAAG CGCCTCCAGG GGGAGGAGAA GCTCCTTTGGCTCTACCACG 1301 AGGTGGAAAA GCCCCTCTCC CGGGTCCTGG CCCACATGGA GGCCACCGGG1351 GTACGGCTGG ACGTGGCCTA CCTGCAGGCC CTTTCCCTGG AGCTTGCGGA 1401GGAGATCCGC CGCCTCGAGG AGGAGGTCTT CCGCTTGGCG GGCCACCCCT 1451 TCAACCTCAACTCCCGGGAC CAGCTGGAGA GGGTGCTCTT TGACGAGCTT 1501 AGGCTTCCCG CCTTGGGGAAGACGCAAAAG ACGGGCAAGC GCTCCACCAG 1551 CGCCGCGGTG CTGGAGGCCC TACGGGAGGCCCACCCCATC GTGGAGAAGA 1601 TCCTCCAGCA CCGGGAGCTC ACCAAGCTCA AGAACACCTACGTGGACCCC 1651 CTCCCAAGCC TCGTCCACCC GAATACGGGC CGCCTCCACA CCCGCTTCAA1701 
CCAGACGGCC ACGGCCACGG GGAGGCTTAG TAGCTCCGAC CCCAACCTGC 1751AGAACATCCC CGTCCGCACC CCCTTGGGCC AGAGGATCCG CCGGGCCTTC 1801 GTGGCCGAGGCGGGTTGGGC GTTGGTGGCC CTGGACTATA GCCAGATAGA 1851 GCTCCGCGTC CTCGCCCACCTCTCCGGGGA CGAGAACCTG ATCAGGGTCT 1901 TCCAGGAGGG GAAGGACATC CACACCCAGACCGCAAGCTG GATGTTCGGC 1951 GTCCCCCCGG AGGCCGTGGA TCCCCTGATG CGCCGGGCGGCCAAGACGGT 2001 GAAC TAC GGC GTCCTCTACG GCATGTCCGC CCATAGGCTC TCCCAGGAGC2051 TTGCCATCCC CTACGAGGAG GCGGTGGCCT TTATAGAGCG CTACTTCCAA 2101AGCTTCCCCA AGGTGCGGGC CTGGATAGAA AAGACCCTGG AGGAGGGGAG 2151 GAAGCGGGGCTACGTGGAAA CCCTCTTCGG AAGAAGGCGC TACGTGCCCG 2201 ACCTCAACGC CCGGGTGAAGAGCGTCAGGG AGGCCGCGGA GCGCATGGCC 2251 TTCAACATGC CCGTCCAGGG CACCGCCGCCGACCTCATGA AGCTCGCCAT 2301 GGTGAAGCTC TTCCCCCGCC TCCGGGAGAT GGGGGCCCGCATGCTCCTCC 2351 AGGTCCACGA CGAGCTCCTC CTGGAGGCCC CCCAAGCGCG GGCCGAGGAG2401 GTGGCGGCTT TGGCCAAGGA GGCCATGGAG AAGGCCTATC CCCTCGCCGT 2451GCCCCTGGAG GTGGAGGTGG GGATGGGGGA GGACTGGCTT TCCGCCAAGG 2501 GTTAG
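As a hedged illustration of how the difference between SEQ ID NO:1 and SEQ ID NO:7 could be located by direct comparison, the sketch below compares positions 2002-2013 of the two sequences as given above and reports which nucleotide positions differ and which codon they fall in. The function names are hypothetical, and the comparison assumes the two fragments are of equal length with no insertions or deletions.

```python
# Positions 2002-2013 of SEQ ID NO:1 and SEQ ID NO:7, spaces removed.
FIRST_POSITION = 2002
WILD_TYPE = "AACTTCGGCGTC"    # SEQ ID NO:1
DERIVATIVE = "AACTACGGCGTC"   # SEQ ID NO:7


def differing_positions(reference: str, derivative: str,
                        first_position: int) -> list:
    """1-based positions (full-sequence numbering) where the fragments differ."""
    if len(reference) != len(derivative):
        raise ValueError("fragments must have equal length")
    return [first_position + i
            for i, (r, d) in enumerate(zip(reference, derivative))
            if r != d]


def codon_span(position: int) -> tuple:
    """First and last 1-based positions of the codon containing position,
    assuming the reading frame begins at position 1."""
    start = position - (position - 1) % 3
    return start, start + 2


if __name__ == "__main__":
    for pos in differing_positions(WILD_TYPE, DERIVATIVE, FIRST_POSITION):
        first, last = codon_span(pos)
        lo = first - FIRST_POSITION
        print(pos, (first, last),
              WILD_TYPE[lo:lo + 3], "->", DERIVATIVE[lo:lo + 3])
```

The same comparison could be run over full-length sequences once the position numbers and spaces are stripped from the listing.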

In another embodiment, the invention provides a nucleic acid of SEQ ID NO:8, a derivative nucleic acid related to Thermus thermophilus, strain RQ-1, having TAC (encoding Tyr) in place of TTC (encoding Phe) at positions 2005-2007. SEQ ID NO:8 is provided below:

1 ATGGAGGCGA TGCTTCCGCT CTTTGAACCC AAAGGCCGGG TCCTCCTGGT 51 GGACGGCCACCACCTGGCCT ACCGCACCTT CTTCGCCCTG AAGGGCCTCA 101 CCACGAGCCG GGGCGAACCGGTGCAGGCGG TCTACGGCTT CGCCAAGAGC 151 CTCCTCAAGG CCCTGAAGGA GGACGGGTACAAGGCCGTCT TCGTGGTCTT 201 TGACGCCAAG GCCCCCTCCT TCCGCCACGA GGCCTACGAGGCCTACAAGG 251 CGGGGAGGGC CCCGACCCCC GAGGACTTCC CCCGGCAGCT CGCCCTCATC301 AAGGAGCTGG TGGACCTCTT GGGGTTTACT CGCCTCGAGG TCCCGGGCTT 351TGAGGCGGAC GACGTCCTCG CCACCCTGGC CAAGAAGGCG GAAAAAGAAG 401 GGTACGAGGTGCGCATCCTC ACCGCCGACC GGGACCTCTA CCAGCTCGTC 451 TCCGACCGGG TCGCCGTCCTCCACCCCGAG GGCCACCTCA TCACCCCGGA 501 GTGGCTTTGG GAGAAGTACG GCCTCAGGCCGGAGCAGTGG GTGGACTTCC 551 GCGCCCTCGT AGGGGACCCC TCCGACAACC TCCCCGGGGTCAAGGGCATC 601 GGGGAGAAGA CCGCCCTCAA GCTCCTTAAG GAGTGGGGAA GCCTGGAAAA651 CCTCCTCAAG AACCTGGACC GGGTGAAGCC GGAAAGCGTC CGGGAGAAGA 701TCAAGGCCCA CCTGGAAGAC CTCAGGCTCT CCTTGGAGCT CTCCCGGGTG 751 CGCACCGACCTCCCCCTGGA GGTGGACCTC GCCCAGGGGC GGGAGCCCGA 801 CCGGGAAGGG CTTAGGGCCTTCCTGGAGAG GCTAGAGTTC GGCAGCCTCC 851 TCCACGAGTT CGGCCTCCTG GAGGCCCCCGCCCCCCTGGA GGAGGCCCCC 901 TGGCCCCCGC CGGAAGGGGC CTTCGTGGGC TTCGTCCTCTCCCGCCCCGA 951 GCCCATGTGG GCGGAGCTTA AAGCCCTGGC CGCCTGCAGG GACGGCCGGG1001 TGCACCGGGC GGAGGACCCC TTGGCGGGGC TTAAGGACCT CAAGGAGGTC 1051CGGGGCCTCC TCGCCAAGGA CCTCGCCGTT TTGGCCTCGA GGGAGGGGCT 1101 AGACCTCGTGCCCGGGGACG ACCCCATGCT CCTCGCCTAC CTCCTGGACC 1151 CCTCCAACAC CACCCCCGAGGGGGTGGCGC GGCGCTACGG GGGGGAGTGG 1201 ACGGAGGACG CCGCCCAGCG GGCCCTCCTCTCGGAGAGGC TCCAGCAGAA 1251 CCTCCTTAAG CGCCTCCAGG GGGAGGAGAA GCTCCTCTGGCTCTACCACG 1301 AGGTGGAAAA GCCCCTCTCC CGGGTCCTGG CCCACATGGA GGCCACCGGG1351 GTACGGCTGG ACGTGGCCTA CCTTCAGGCC CTTTCCCTGG AGCTTGCGGA 1401GGAGATCCGC CGCCTCGAGG AGGAGGTCTT CCGCTTGGCG GGCCACCCCT 1451 TCAACCTCAACTCCCGGGAC CAGCTGGAAA GGGTGCTCTT TGACGAGCTT 1501 AGGCTTCCCG CCTTGGGGAAGACGCAAAAG ACGGGCAAGC GCTCCACCAG 1551 CGCCGCGGTG CTGGAGGCCC TACGGGAGGCCCACCCCATC GTGGAGAAGA 1601 TCCTCCAGCA CCGGGAGCTC ACCAAGCTCA AGAACACCTACGTGGACCCC 1651 CTCCCAAGCC TCGTCCACCC GAGGACGGGC CGCCTCCACA CCCGCTTCAA1701 
CCAGACGGCC ACGGCCACGG GGAGGCTTAG TAGCTCCGAC CCCAACCTGC 1751AGAACATCCC CGTCCGCACC CCCTTGGGCC AGAGGATCCG CCGGGCCTTC 1801 GTAGCCGAGGCGGGATGGGC GTTGGTGGCC CTGGACTATA GCCAGATAGA 1851 GCTCCGCGTC CTCGCCCACCTCTCCGGGGA CGAGAACCTG ATCAGGGTCT 1901 TCCAGGAGGG GAAGGACATC CACACCCAGACCGCAAGCTG GATGTTCGGT 1951 GTCCCCCCGG AGGCCGTGGA CCCCCTGATG CGCCGGGCGGCCAAGACGGT 2001 GAAC TAC GGC GTCCTCTACG GCATGTCCGC CCACCGGCTC TCCCAGGAGC2051 TTTCCATCCC CTACGAGGAG GCGGTGGCCT TTATAGAGCG CTACTTCCAA 2101AGCTTCCCCA AGGTGCGGGC CTGGATAGAA AAGACCCTGG AGGAGGGGAG 2151 GAAGCGGGGCTACGTGGAAA CCCTCTTCGG AAGAAGGCGC TACGTGCCCG 2201 ACCTCAACGC CCGGGTGAAGAGCGTCAGGG AGGCCGCGGA GCGCATGGCC 2251 TTCAACATGC CCGTCCAGGG CACCGCCGCCGACCTCATGA AGCTCGCCAT 2301 GGTGAAGCTC TTCCCCCGCC TCCGGGAGAT GGGGGCCCGCATGCTCCTCC 2351 AGGTCCACGA CGAGCTCCTC CTGGAGGCCC CCCAAGCGCG GGCCGAGGAG2401 GTGGCGGCTT TGGCCAAGGA GGCCATGGAG AAGGCCTATC CCCTCGCCGT 2451ACCCCTGGAG GTGGAGGTGG GGATCGGGGA GGACTGGCTT TCCGCCAAGG 2501 GCTAG

In another embodiment, the invention provides a nucleic acid of SEQ ID NO:9, a derivative nucleic acid related to Thermus thermophilus, strain 1b21, having TAC (encoding Tyr) in place of TTC (encoding Phe) at positions 2005-2007. SEQ ID NO:9 is provided below:

1 ATGGAGGCGA TGCTTCCGCT CTTTGAACCC AAAGGCCGGG TCCTCCTGGT 51 GGACGGCCACCACCTGGCCT ACCGCACCTT CTTCGCCCTG AAGGGCCTCA 101 CCACGAGCCG GGGCGAACCGGTGCAGGCGG TCTACGGCTT CGCCAAGAGC 151 CTCCTCAAGG CCCTGAAGGA GGACGGGTACAAGGCCGTCT TCGTGGTCTT 201 TGACGCCAAG GCCCCCTCCT TCCGCCACGA GGCCTACGAGGCCTACAAGG 251 CGGGGAGGGC CCCGACCCCC GAGGACTTCC CCCGGCAGCT CGCCCTCATC301 AAGGAGCTGG TGGACCTCCT GGGGTTTACC CGCCTCGAGG TCCCCGGCTA 351CGAGGCGGAC GACGTCCTCG CCACCCTGGC CAAGAAGGCG GAAAAGGAGG 401 GGTACGAGGTGCGCATCCTC ACCGCCGACC GCGACCTCTA CCAACTCGTC 451 TCCGACCGCG TCGCCGTCCTCCACCCCGAG GGCCACCTCA TCACCCCGGA 501 GTGGCTTTGG GAGAAGTACG GCCTCAAGCCGGAGCAGTGG GTGGACTTCC 551 GCGCCCTCGT GGGGGACCCC TCCGACAACC TCCCCGGGGTCAAGGGCATC 601 GGGGAGAAGA CCGCCCTCAA GCTCCTCAAG GAGTGGGGAA GCCTGGAAAA651 CCTCCTCAAG AACCTGGACC GGGTAAAGCC AGAAAACGTC CGGGAGAAGA 701TCAAGGCCCA CCTGGAAGAC CTCAGGCTTT CCTTGGAGCT CTCCCGGGTG 751 CGCACCGACCTCCCCCTGGA GGTGGACCTC GCCCAGGGGC GGGAGCCCGA 801 CCGGGAGGGG CTTAGGGCCTTCCTGGAGAG GCTGGAGTTC GGCAGCCTCC 851 TCCACGAGTT CGGCCTCCTG GAGGCCCCCGCCCCCCTGGA GGAGGCCCCC 901 TGGCCCCCGC CGGAAGGGGC CTTCGTGGGC TTCGTCCTCTCCCGCCCCGA 951 GCCCATGTGG GCGGAGCTTA AAGCCCTGGC CGCCTGCAGG GACGGCCGGG1001 TGCACCGGGC AGCAGACCCC TTGGCGGGGC TAAAGGACCT CAAGGAGGTC 1051CGGGGCCTCC TCGCCAAGGA CCTCGCCGTC TTGGCCTCGA GGGAGGGGCT 1101 AGACCTCGTGCCCGGGGACG ACCCCATGCT CCTCGCCTAC CTCCTGGACC 1151 CCTCCAACAC CACCCCCGAGGGGGTGGCGC GGCGCTACGG GGGGGAGTGG 1201 ACGGAGGACG CCGCCCACCG GGCCCTCCTCTCGGAGAGGC TCCATCGGAA 1251 CCTCCTTAAG CGCCTCGAGG GGGAGGAGAA GCTCCTTTGGCTCTACCACG 1301 AGGTGGAAAA GCCCCTCTCC CGGGTCCTGG CCCACATGGA GGCCACCGGG1351 GTACGGCTGG ACGTGGCCTA CCTTCAGGCC CTTTCCCTGG AGCTTGCGGA 1401GGAGATCCGC CGCCTCGAGG AGGAGGTCTT CCGCTTGGCG GGCCACCCCT 1451 TCAACCTCAACTCCCGGGAC CAGCTGGAAA GGGTGCTCTT TGACGAGCTT 1501 AGGCTTCCCG CCTTGGGGAAGACGCAAAAG ACGGGCAAGC GCTCCACCAG 1551 CGCCGCGGTG CTGGAGGCCC TACGGGAGGCCCACCCCATC GTGGAGAAGA 1601 TCCTCCAGCA CCGGGAGCTC ACCAAGCTCA AGAACACCTACGTGGACCCC 1651 CTCCCAAGCC TCGTCCACCC GAGGACGGGC CGCCTCCACA CCCGCTTCAA1701 
CCAGACGGCC ACGGCCACGG GGAGGCTTAG TAGCTCCGAC CCCAACCTGC 1751AGAACATCCC CGTCCGCACC CCCTTGGGCC AGAGGATCCG CCGGGCCTTC 1801 GTGGCCGAGGCGGGATGGGC GTTGGTGGCC CTGGACTATA GCCAGATAGA 1851 GCTCCGCGTC CTCGCCCACCTCTCCGGGGA CGAGAACCTG ATCAGGGTCT 1901 TCCAGGAGGG GAAGGACATC CACACCCAGACCGCAAGCTG GATGTTCGGC 1951 GTCCCCCCGG AGGCCGTGGA CCCCCTGATG CGCCGGGCGGCCAAGACGGT 2001 GAAC TAC GGC GTCCTCTACG GCATGTCCGC CCATAGGCTC TCCCAGGAGC2051 TTGCCATCCC CTACGAGGAG GCGGTGGCCT TTATAGAGCG CTACTTCCAA 2101AGCTTCCCCA AGGTGCGGGC CTGGATAGAA AAGACCCTGG AGGAGGGGAG 2151 AAAGCGGGGCTACGTGGAAA CCCTCTTCGG AAGAAGGCGC TACGTGCCCG 2201 ACCTCAACGC CCGGGTGAAGAGCGTCAGGG AGGCCGCGGA GCGCATGGCC 2251 TTCAACATGC CCGTCCAGGG CACCGCCGCCGACCTCATGA AGCTCGCCAT 2301 GGTGAAGCTC TTCCCCCGCC TCCGGGAGAT GGGGGCCCGCATGCTCCTCC 2351 AGGTCCACGA CGAGCTCCTC CTGGAGGCCC CCCAAGCGCG GGCCGAGGAG2401 GTGGCGGCTT TGGCCAAGGA GGCCATGGAG AAGGCCTATC CCCTCGCCGT 2451GCCCCTGGAG GTGGAGGTGG GGATGGGGGA GGACTGGCTT TCCGCCAAGG 2501 GTTAG

In another embodiment, the invention provides a nucleic acid of SEQ ID NO:10, a derivative nucleic acid related to Thermus thermophilus, strain GK24, having GAC (encoding Asp) in place of GGC (encoding Gly) at positions 136-138, and having TAC (encoding Tyr) in place of TTC (encoding Phe) at positions 2005-2007. SEQ ID NO:10 is provided below:

1 ATGGAGGCGA TGCTTCCGCT CTTTGAACCC AAAGGCCGGG TCCTCCTGGT 51 GGACGGCCACCACCTGGCCT ACCGCACCTT CTTCGCCCTG AAGGGCCTCA 101 CCACGAGCCG GGGCGAACCGGTGCAGGCGG TCTAC GAC TT CGCCAAGAGC 151 CTCCTCAAGG CCCTGAAGGA GGACGGGTACAAGGCCGTCT TCGTGGTCTT 201 TGACGCCAAG GCCCCCTCCT TCCGCCACGA GGCCTACGAGGCCTACAAGG 251 CGGGGAGGGC CCCGACCCCC GAGGACTTCC CCCGGCAGCT CGCCCTCATC301 AAGGAGCTGG TGGACCTCCT GGGGTTTACC CGCCTCGAGG TCCCCGGCTA 351CGAGGCGGAC GACGTCCTCG CCACCCTGGC CAAGAAGGCG GAAAAGGAGG 401 GGTACGAGGTGCGCATCCTC ACCGCCGACC GCGACCTCTA CCAACTCGTC 451 TCCGACCGCG TCGCCGTCCTCCACCCCGAG GGCCACCTCA TCACCCCGGA 501 GTGGCTTTGG CAGAAGTACG GCCTCAAGCCGGAGCAGTGG GTGGACTTCC 551 GCGCCCTCGT GGGGGACCCC TCCGACAACC TCCCCGGGGTCAAGGGCATC 601 GGGGAGAAGA CCGCCCTCAA GCTCCTCAAG GAGTGGGGAA GCCTGGAAAA651 CCTCCTCAAG AACCTGGACC GGGTAAAGCC AGAAAACGTC CGGGAGAAGA 701TCAAGGCCCA CCTGGAAGAC CTCAGGCTTT CCTTGGAGCT CTCCCGGGTG 751 CGCACCGACCTCCCCCTGGA GGTGGACCTC GCCCAGGGGC GGGAGCCCGA 801 CCGGGAGGGG CTTAGGGCCTTCCTGGAGAG GCTGGAGTTC GGCAGCCTCC 851 TCCACGAGTT CGGCCTCCTG GAGGCCCCCGCCCCCCTGGA GGAGGCCCCC 901 TGGCCCCCGC CGGAAGGGGC CTTCGTGGGC TTCGTCCTCTCCCGCCCCGA 951 GCCCATGTGG GCGGAGCTTA AAGCCCTGGC CGCCTGCAGG GACGGCCGGG1001 TGCACCGGGC AGCGGACCCC TTGGCGGGGC TAAAGGACCT CAAGGAGGTC 1051CGGGGCCTCC TCGCCAAGGA CCTCGCCGTC TTGGCCTCGA GGGAGGGGCT 1101 AGACCTCGTGCCCGGGGACG ACCCCATGCT CCTCGCCTAC CTCCTGGACC 1151 CCTCCAACAC CACCCCCGAGGGGGTGGCGC GGCGCTACGG GGGGGAGTGG 1201 ACGGAGGACG CCGCCCACCG GGCCCTCCTCTCGGAGAGGC TCCATCGGAA 1251 CCTCCTTAAG CGCCTCCAGG GGGAGGAGAA GCTCCTTTGGCTCTACCACG 1301 AGGTGGAAAA GCCCCTCTCC CGGGTCCTGG CCCACATGGA GGCCACCGGG1351 GTACGGCTGG ACGTGGCCTA CCTGCAGGCC CTTTCCCTGG AGCTTGCGGA 1401GGAGATCCGC CGCCTCGAGG AGGAGGTCTT CCGCTTGGCG GGCCACCCCT 1451 TCAACCTCAACTCCCGGGAC CAGCTGGAGA GGGTGCTCTT TGACGAGCTT 1501 AGGCTTCCCG CCTTGGGGAAGACGCAAAAG ACGGGCAAGC GCTCCACCAG 1551 CGCCGCGGTG CTGGAGGCCC TACGGGAGGCCCACCCCATC GTGGAGAAGA 1601 TCCTCCAGCA CCGGGAGCTC ACCAAGCTCA AGAACACCTACGTGGACCCC 1651 CTCCCAAGCC TCGTCCACCC GAATACGGGC CGCCTCCACA CCCGCTTCAA1701 
CCAGACGGCC ACGGCCACGG GGAGGCTTAG TAGCTCCGAC CCCAACCTGC 1751AGAACATCCC CGTCCGCACC CCCTTGGGCC AGAGGATCCG CCGGGCCTTC 1801 GTGGCCGAGGCGGGTTGGGC GTTGGTGGCC CTGGACTATA GCCAGATAGA 1851 GCTCCGCGTC CTCGCCCACCTCTCCGGGGA CGAGAACCTG ATCAGGGTCT 1901 TCCAGGAGGG GAAGGACATC CACACCCAGACCGCAAGCTG GATGTTCGGC 1951 GTCCCCCCGG AGGCCGTGGA TCCCCTGATG CGCCGGGCGGCCAAGACGGT 2001 GAAC TAC GGC GTCCTCTACG GCATGTCCGC CCATAGGCTC TCCCAGGAGC2051 TTGCCATCCC CTACGAGGAG GCGGTGGCCT TTATAGAGCG CTACTTCCAA 2101AGCTTCCCCA AGGTGCGGGC CTGGATAGAA AAGACCCTGG AGGAGGGGAG 2151 GAAGCGGGGCTACGTGGAAA CCCTCTTCGG AAGAAGGCGC TACGTGCCCG 2201 ACCTCAACGC CCGGGTGAAGAGCGTCAGGG AGGCCGCGGA GCGCATGGCC 2251 TTCAACATGC CCGTCCAGGG CACCGCCGCCGACCTCATGA AGCTCGCCAT 2301 GGTGAAGCTC TTCCCCCGCC TCCGGGAGAT GGGGGCCCGCATGCTCCTCC 2351 AGGTCCACGA CGAGCTCCTC CTGGAGGCCC CCCAAGCGCG GGCCGAGGAG2401 GTGGCGGCTT TGGCCAAGGA GGCCATGGAG AAGGCCTATC CCCTCGCCGT 2451GCCCCTGGAG GTGGAGGTGG GGATGGGGGA GGACTGGCTT TCCGCCAAGG 2501 GTTAG

In another embodiment, the invention provides a nucleic acid of SEQ ID NO:11, a derivative nucleic acid related to Thermus thermophilus, strain RQ-1, having GAC (encoding Asp) in place of GGC (encoding Gly) at positions 136-138, and having TAC (encoding Tyr) in place of TTC (encoding Phe) at positions 2005-2007. SEQ ID NO:11 is provided below:

1 ATGGAGGCGA TGCTTCCGCT CTTTGAACCC AAAGGCCGGG TCCTCCTGGT 51 GGACGGCCACCACCTGGCCT ACCGCACCTT CTTCGCCCTG AAGGGCCTCA 101 CCACGAGCCG GGGCGAACCGGTGCAGGCGG TCTAC GAC TT CGCCAAGAGC 151 CTCCTCAAGG CCCTGAAGGA GGACGGGTACAAGGCCGTCT TCGTGGTCTT 201 TGACGCCAAG GCCCCCTCCT TCCGCCACGA GGCCTACGAGGCCTACAAGG 251 CGGGGAGGGC CCCGACCCCC GAGGACTTCC CCCGGCAGCT CGCCCTCATC301 AAGGAGCTGG TGGACCTCTT GGGGTTTACT CGCCTCGAGG TCCCGGGCTT 351TGAGGCGGAC GACGTCCTCG CCACCCTGGC CAAGAAGGCG GAAAAAGAAG 401 GGTACGAGGTGCGCATCCTC ACCGCCGACC GGGACCTCTA CCAGCTCGTC 451 TCCGACCGGG TCGCCGTCCTCCACCCCGAG GGCCACCTCA TCACCCCGGA 501 GTGGCTTTGG GAGAAGTACG GCCTCAGGCCGGAGCAGTGG GTGGACTTCC 551 GCGCCCTCGT AGGGGACCCC TCCGACAACC TCCCCGGGGTCAAGGGCATC 601 GGGGAGAAGA CCGCCCTCAA GCTCCTTAAG GAGTGGGGAA GCCTGGAAAA651 CCTCCTCAAG AACCTGGACC GGGTGAAGCC GGAAAGCGTC CGGGAGAAGA 701TCAAGGCCCA CCTGGAAGAC CTCAGGCTCT CCTTGGAGCT CTCCCGGGTG 751 CGCACCGACCTCCCCCTGGA GGTGGACCTC GCCCAGGGGC GGGAGCCCGA 801 CCGGGAAGGG CTTAGGGCCTTCCTGGAGAG GCTAGAGTTC GGCAGCCTCC 851 TCCACGAGTT CGGCCTCCTG GAGGCCCCCGCCCCCCTGGA GGAGGCCCCC 901 TGGCCCCCGC CGGAAGGGGC CTTCGTGGGC TTCGTCCTCTCCCGCCCCGA 951 GCCCATGTGG GCGGAGCTTA AAGCCCTGGC CGCCTGCAGG GACGGCCGGG1001 TGCACCGGGC GGAGGACCCC TTGGCGGGGC TTAAGGACCT CAAGGAGGTC 1051CGGGGCCTCC TCGCCAAGGA CCTCGCCGTT TTGGCCTCGA GGGAGGGGCT 1101 AGACCTCGTGCCCGGGGACG ACCCCATGCT CCTCGCCTAC CTCCTGGACC 1151 CCTCCAACAC CACCCCCGAGGGGGTGGCGC GGCGCTACGG GGGGGAGTGG 1201 ACGGAGGACG CCGCCCAGCG GGCCCTCCTCTCGGAGAGGC TCCAGCAGAA 1251 CCTCCTTAAG CGCCTCCAGG GGGAGGAGAA GCTCCTCTGGCTCTACCACG 1301 AGGTGGAAAA GCCCCTCTCC CGGGTCCTGG CCCACATGGA GGCCACCGGG1351 GTACGGCTGG ACGTGGCCTA CCTTCAGGCC CTTTCCCTGG AGCTTGCGGA 1401GGAGATCCGC CGCCTCGAGG AGGAGGTCTT CCGCTTGGCG GGCCACCCCT 1451 TCAACCTCAACTCCCGGGAC CAGCTGGAAA GGGTGCTCTT TGACGAGCTT 1501 AGGCTTCCCG CCTTGGGGAAGACGCAAAAG ACGGGCAAGC GCTCCACCAG 1551 CGCCGCGGTG CTGGAGGCCC TACGGGAGGCCCACCCCATC GTGGAGAAGA 1601 TCCTCCAGCA CCGGGAGCTC ACCAAGCTCA AGAACACCTACGTGGACCCC 1651 CTCCCAAGCC TCGTCCACCC GAGGACGGGC CGCCTCCACA CCCGCTTCAA1701 
CCAGACGGCC ACGGCCACGG GGAGGCTTAG TAGCTCCGAC CCCAACCTGC 1751AGAACATCCC CGTCCGCACC CCCTTGGGCC AGAGGATCCG CCGGGCCTTC 1801 GTAGCCGAGGCGGGATGGGC GTTGGTGGCC CTGGACTATA GCCAGATAGA 1851 GCTCCGCGTC CTCGCCCACCTCTCCGGGGA CGAGAACCTG ATCAGGGTCT 1901 TCCAGGAGGG GAAGGACATC CACACCCAGACCGCAAGCTG GATGTTCGGT 1951 GTCCCCCCGG AGGCCGTGGA CCCCCTGATG CGCCGGGCGGCCAAGACGGT 2001 GAAC TAC GGC GTCCTCTACG GCATGTCCGC CCACCGGCTC TCCCAGGAGC2051 TTTCCATCCC CTACGAGGAG GCGGTGGCCT TTATAGAGCG CTACTTCCAA 2101AGCTTCCCCA AGGTGCGGGC CTGGATAGAA AAGACCCTGG AGGAGGGGAG 2151 GAAGCGGGGCTACGTGGAAA CCCTCTTCGG AAGAAGGCGC TACGTGCCCG 2201 ACCTCAACGC CCGGGTGAAGAGCGTCAGGG AGGCCGCGGA GCGCATGGCC 2251 TTCAACATGC CCGTCCAGGG CACCGCCGCCGACCTCATGA AGCTCGCCAT 2301 GGTGAAGCTC TTCCCCCGCC TCCGGGAGAT GGGGGCCCGCATGCTCCTCC 2351 AGGTCCACGA CGAGCTCCTC CTGGAGGCCC CCCAAGCGCG GGCCGAGGAG2401 GTGGCGGCTT TGGCCAAGGA GGCCATGGAG AAGGCCTATC CCCTCGCCGT 2451ACCCCTGGAG GTGGAGGTGG GGATCGGGGA GGACTGGCTT TCCGCCAAGG 2501 GCTAG

In another embodiment, the invention provides a nucleic acid of SEQ ID NO:12, a derivative nucleic acid related to Thermus thermophilus, strain 1b21, having GAC (encoding Asp) in place of GGC (encoding Gly) at positions 136-138, and having TAC (encoding Tyr) in place of TTC (encoding Phe) at positions 2005-2007. SEQ ID NO:12 is provided below:

1 ATGGAGGCGA TGCTTCCGCT CTTTGAACCC AAAGGCCGGG TCCTCCTGGT 51 GGACGGCCACCACCTGGCCT ACCGCACCTT CTTCGCCCTG AAGGGCCTCA 101 CCACGAGCCG GGGCGAACCGGTGCAGGCGG TCTAC GAC TT CGCCAAGAGC 151 CTCCTCAAGG CCCTGAAGGA GGACGGGTACAAGGCCGTCT TCGTGGTCTT 201 TGACGCCAAG GCCCCCTCCT TCCGCCACGA GGCCTACGAGGCCTACAAGG 251 CGGGGAGGGC CCCGACCCCC GAGGACTTCC CCCGGCAGCT CGCCCTCATC301 AAGGAGCTGG TGGACCTCCT GGGGTTTACC CGCCTCGAGG TCCCCGGCTA 351CGAGGCGGAC GACGTCCTCG CCACCCTGGC CAAGAAGGCG GAAAAGGAGG 401 GGTACGAGGTGCGCATCCTC ACCGCCGACC GCGACCTCTA CCAACTCGTC 451 TCCGACCGCG TCGCCGTCCTCCACCCCGAG GGCCACCTCA TCACCCCGGA 501 GTGGCTTTGG GAGAAGTACG GCCTCAAGCCGGAGCAGTGG GTGGACTTCC 551 GCGCCCTCGT GGGGGACCCC TCCGACAACC TCCCCGGGGTCAAGGGCATC 601 GGGGAGAAGA CCGCCCTCAA GCTCCTCAAG GAGTGGGGAA GCCTGGAAAA651 CCTCCTCAAG AACCTGGACC GGGTAAAGCC AGAAAACGTC CGGGAGAAGA 701TCAAGGCCCA CCTGGAAGAC CTCAGGCTTT CCTTGGAGCT CTCCCGGGTG 751 CGCACCGACCTCCCCCTGGA GGTGGACCTC GCCCAGGGGC GGGAGCCCGA 801 CCGGGAGGGG CTTAGGGCCTTCCTGGAGAG GCTGGAGTTC GGCAGCCTCC 851 TCCACGAGTT CGGCCTCCTG GAGGCCCCCGCCCCCCTGGA GGAGGCCCCC 901 TGGCCCCCGC CGGAAGGGGC CTTCGTGGGC TTCGTCCTCTCCCGCCCCGA 951 GCCCATGTGG GCGGAGCTTA AAGCCCTGGC CGCCTGCAGG GACGGCCGGG1001 TGCACCGGGC AGCAGACCCC TTGGCGGGGC TAAAGGACCT CAAGGAGGTC 1051CGGGGCCTCC TCGCCAAGGA CCTCGCCGTC TTGGCCTCGA GGGAGGGGCT 1101 AGACCTCGTGCCCGGGGACG ACCCCATGCT CCTCGCCTAC CTCCTGGACC 1151 CCTCCAACAC CACCCCCGAGGGGGTGGCGC GGCGCTACGG GGGGGAGTGG 1201 ACGGAGGACG CCGCCCACCG GGCCCTCCTCTCGGAGAGGC TCCATCGGAA 1251 CCTCCTTAAG CGCCTCGAGG GGGAGGAGAA GCTCCTTTGGCTCTACCACG 1301 AGGTGGAAAA GCCCCTCTCC CGGGTCCTGG CCCACATGGA GGCCACCGGG1351 GTACGGCTGG ACGTGGCCTA CCTTCAGGCC CTTTCCCTGG AGCTTGCGGA 1401GGAGATCCGC CGCCTCGAGG AGGAGGTCTT CCGCTTGGCG GGCCACCCCT 1451 TCAACCTCAACTCCCGGGAC CAGCTGGAAA GGGTGCTCTT TGACGAGCTT 1501 AGGCTTCCCG CCTTGGGGAAGACGCAAAAG ACGGGCAAGC GCTCCACCAG 1551 CGCCGCGGTG CTGGAGGCCC TACGGGAGGCCCACCCCATC GTGGAGAAGA 1601 TCCTCCAGCA CCGGGAGCTC ACCAAGCTCA AGAACACCTACGTGGACCCC 1651 CTCCCAAGCC TCGTCCACCC GAGGACGGGC CGCCTCCACA CCCGCTTCAA1701 
CCAGACGGCC ACGGCCACGG GGAGGCTTAG TAGCTCCGAC CCCAACCTGC 1751AGAACATCCC CGTCCGCACC CCCTTGGGCC AGAGGATCCG CCGGGCCTTC 1801 GTGGCCGAGGCGGGATGGGC GTTGGTGGCC CTGGACTATA GCCAGATAGA 1851 GCTCCGCGTC CTCGCCCACCTCTCCGGGGA CGAGAACCTG ATCAGGGTCT 1901 TCCAGGAGGG GAAGGACATC CACACCCAGACCGCAAGCTG GATGTTCGGC 1951 GTCCCCCCGG AGGCCGTGGA CCCCCTGATG CGCCGGGCGGCCAAGACGGT 2001 GAAC TAC GGC GTCCTCTACG GCATGTCCGC CCATAGGCTC TCCCAGGAGC2051 TTGCCATCCC CTACGAGGAG GCGGTGGCCT TTATAGAGCG CTACTTCCAA 2101AGCTTCCCCA AGGTGCGGGC CTGGATAGAA AAGACCCTGG AGGAGGGGAG 2151 AAAGCGGGGCTACGTGGAAA CCCTCTTCGG AAGAAGGCGC TACGTGCCCG 2201 ACCTCAACGC CCGGGTGAAGAGCGTCAGGG AGGCCGCGGA GCGCATGGCC 2251 TTCAACATGC CCGTCCAGGG CACCGCCGCCGACCTCATGA AGCTCGCCAT 2301 GGTGAAGCTC TTCCCCCGCC TCCGGGAGAT GGGGGCCCGCATGCTCCTCC 2351 AGGTCCACGA CGAGCTCCTC CTGGAGGCCC CCCAAGCGCG GGCCGAGGAG2401 GTGGCGGCTT TGGCCAAGGA GGCCATGGAG AAGGCCTATC CCCTCGCCGT 2451GCCCCTGGAG GTGGAGGTGG GGATGGGGGA GGACTGGCTT TCCGCCAAGG 2501 GTTAG

The substitution of TAC (encoding Tyr) for TTC (encoding Phe) at the indicated positions can reduce discrimination against ddNTP incorporation by DNA polymerase I. See, e.g., U.S. Pat. No. 5,614,365, which is incorporated herein by reference. The substitution of GAC (encoding Asp) for GGC (encoding Gly) at the indicated positions removes the 5′-3′ exonuclease activity.
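The codon substitutions described above can be illustrated with a short sketch. It uses only the first 150 nucleotides of SEQ ID NO:9 as printed above and replaces the codon at positions 136-138. The function name `substitute_codon` is hypothetical and serves only to illustrate the bookkeeping; it is not a mutagenesis protocol.

```python
# Illustrative sketch: in-silico codon replacement at a 1-based position.
# The helper name is an assumption made for this example.

def substitute_codon(seq: str, start: int, expected: str, replacement: str) -> str:
    """Replace the 3-nt codon beginning at 1-based position `start`.

    Raises ValueError if the codon found there is not `expected`, which
    guards against off-by-one errors in position numbering.
    """
    i = start - 1
    found = seq[i:i + 3]
    if found != expected:
        raise ValueError(f"expected {expected} at position {start}, found {found}")
    return seq[:i] + replacement + seq[i + 3:]

# Nucleotides 1-150 of SEQ ID NO:9 as listed above (spaces removed).
first_150 = (
    "ATGGAGGCGA TGCTTCCGCT CTTTGAACCC AAAGGCCGGG TCCTCCTGGT "
    "GGACGGCCAC CACCTGGCCT ACCGCACCTT CTTCGCCCTG AAGGGCCTCA "
    "CCACGAGCCG GGGCGAACCG GTGCAGGCGG TCTACGGCTT CGCCAAGAGC"
).replace(" ", "")

# GGC (Gly) -> GAC (Asp) at positions 136-138.
mutant_150 = substitute_codon(first_150, 136, "GGC", "GAC")
print(mutant_150[130:140])
```

The same helper would apply to the TTC to TAC change at positions 2005-2007 when given a full-length sequence.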

The nucleic acids of the invention have homology to portions of the nucleic acids encoding the thermostable DNA polymerases of Thermus thermophilus (see FIGS. 1A and 1B and FIGS. 2A, 2B, and 2C). However, significant portions of the nucleic acid sequences of the present invention are distinct.

The invention also encompasses fragment and variant nucleic acids of SEQ ID NO:1-12. Nucleic acid “fragments” encompassed by the invention are of two general types. First, fragment nucleic acids that do not encode a full-length nucleic acid polymerase but do encode a thermally stable polypeptide with nucleic acid polymerase activity are encompassed within the invention. Second, fragment nucleic acids useful as hybridization probes but that generally do not encode polymerases retaining biological activity are also encompassed within the invention. Thus, fragments of nucleotide sequences such as SEQ ID NO:1-12 may be as small as about 9 nucleotides, about 12 nucleotides, about 15 nucleotides, about 17 nucleotides, about 18 nucleotides, about 20 nucleotides, about 50 nucleotides, about 100 nucleotides or more. In general, a fragment nucleic acid of the invention can have any upper size limit so long as it is related in sequence to the nucleic acids of the invention but is not full length.

As indicated above, “variants” are substantially similar or substantially homologous sequences. For nucleotide sequences, variants include those sequences that, because of the degeneracy of the genetic code, encode the identical amino acid sequence of the native nucleic acid polymerase protein. Variant nucleic acids also include those that encode polypeptides that do not have amino acid sequences identical to that of a native nucleic acid polymerase protein, but that encode an active, thermally stable nucleic acid polymerase with conservative changes in the amino acid sequence.

As is known by one of skill in the art, the genetic code is “degenerate,” meaning that several trinucleotide codons can encode the same amino acid. This degeneracy is apparent from Table 1.

TABLE 1

1st                        Second Position                       3rd
Position   T            C            A             G             Position
T          TTT = Phe    TCT = Ser    TAT = Tyr     TGT = Cys     T
T          TTC = Phe    TCC = Ser    TAC = Tyr     TGC = Cys     C
T          TTA = Leu    TCA = Ser    TAA = Stop    TGA = Stop    A
T          TTG = Leu    TCG = Ser    TAG = Stop    TGG = Trp     G
C          CTT = Leu    CCT = Pro    CAT = His     CGT = Arg     T
C          CTC = Leu    CCC = Pro    CAC = His     CGC = Arg     C
C          CTA = Leu    CCA = Pro    CAA = Gln     CGA = Arg     A
C          CTG = Leu    CCG = Pro    CAG = Gln     CGG = Arg     G
A          ATT = Ile    ACT = Thr    AAT = Asn     AGT = Ser     T
A          ATC = Ile    ACC = Thr    AAC = Asn     AGC = Ser     C
A          ATA = Ile    ACA = Thr    AAA = Lys     AGA = Arg     A
A          ATG = Met    ACG = Thr    AAG = Lys     AGG = Arg     G
G          GTT = Val    GCT = Ala    GAT = Asp     GGT = Gly     T
G          GTC = Val    GCC = Ala    GAC = Asp     GGC = Gly     C
G          GTA = Val    GCA = Ala    GAA = Glu     GGA = Gly     A
G          GTG = Val    GCG = Ala    GAG = Glu     GGG = Gly     G

Hence, many changes in the nucleotide sequence of the variant may be silent and may not alter the amino acid sequence encoded by the nucleic acid. Where nucleic acid sequence alterations are silent, a variant nucleic acid will encode a polypeptide with the same amino acid sequence as the reference nucleic acid. Therefore, a particular nucleic acid sequence of the invention also encompasses variants with degenerate codon substitutions, and complementary sequences thereof, as well as the sequence explicitly specified by a SEQ ID NO. Specifically, degenerate codon substitutions may be achieved by generating sequences in which the reference codon is replaced by any of the codons for the amino acid specified by the reference codon. In general, the third position of one or more selected codons can be substituted with mixed-base and/or deoxyinosine residues as disclosed by Batzer et al., Nucleic Acids Res., 19, 5081 (1991) and/or Ohtsuka et al., J. Biol. Chem., 260, 2605 (1985); Rossolini et al., Mol. Cell. Probes, 8, 91 (1994).
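As a minimal sketch of how degenerate codon substitutions leave the encoded amino acid sequence unchanged, the following builds the standard genetic code (in which GAA and GAG encode Glu) and compares a reference fragment with a variant. The reference is the first 15 nucleotides of the sequences listed above; the variant and the helper names `translate` and `synonymous_codons` are assumptions introduced for illustration only.

```python
from itertools import product

# Standard genetic code, ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(seq: str) -> str:
    """Translate complete codons in reading frame 1 ('*' marks a stop)."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))

def synonymous_codons(codon: str) -> list:
    """Return every codon that encodes the same amino acid as `codon`."""
    amino_acid = CODON_TABLE[codon]
    return sorted(c for c, a in CODON_TABLE.items() if a == amino_acid)

reference = "ATGGAGGCGATGCTT"  # first 15 nt of the sequences above
variant = "ATGGAAGCCATGCTC"    # hypothetical variant with third-position changes

print(translate(reference), translate(variant))
print(synonymous_codons("GGC"))
```

Because both fragments translate to the same peptide, the variant would be a silent (degenerate) variant of the reference in the sense used above.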

However, the invention is not limited to silent changes in the present nucleotide sequences but also includes variant nucleic acid sequences that conservatively alter the amino acid sequence of a polypeptide of the invention. According to the present invention, variant and reference nucleic acids of the invention may differ in the encoded amino acid sequence by one or more substitutions, additions, insertions, deletions, fusions and truncations, which may be present in any combination, so long as an active, thermally stable nucleic acid polymerase is encoded by the variant nucleic acid. Such variant nucleic acids will not encode exactly the same amino acid sequence as the reference nucleic acid, but have conservative sequence changes.

Variant nucleic acids with silent and conservative changes can be defined and characterized by the degree of homology to the reference nucleic acid. Preferred variant nucleic acids are “substantially homologous” to the reference nucleic acids of the invention. As recognized by one of skill in the art, such substantially similar nucleic acids can hybridize under stringent conditions with the reference nucleic acids identified by SEQ ID NOs herein. These types of substantially homologous nucleic acids are encompassed by this invention.

Generally, nucleic acid derivatives and variants of the invention will have at least 90%, 91%, 92%, 93% or 94% sequence identity to the reference nucleotide sequence defined herein. Preferably, nucleic acids of the invention will have at least 95%, 96%, 97%, 98%, or 99% sequence identity to the reference nucleotide sequence defined herein.
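For illustration, percent sequence identity over two already-aligned, equal-length stretches can be computed as below. This is a simplified sketch that ignores gaps and alignment, which programs such as BLASTN handle; the function name `percent_identity` is an assumption. The two 20-nucleotide stretches are positions 501-520 of SEQ ID NO:9 and SEQ ID NO:10 as printed above.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions that match between two pre-aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

seq9_501_520 = "GTGGCTTTGGGAGAAGTACG"   # SEQ ID NO:9, positions 501-520
seq10_501_520 = "GTGGCTTTGGCAGAAGTACG"  # SEQ ID NO:10, positions 501-520

identity = percent_identity(seq9_501_520, seq10_501_520)
print(f"{identity:.1f}% identity")
```

A single mismatch in 20 positions gives 95% identity over this window; identity over the full-length sequences would need to be computed from a full alignment.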

Variant nucleic acids can be detected and isolated by standard hybridization procedures.

Hybridization to detect or isolate such sequences is generally carried out under stringent conditions. “Stringent hybridization conditions” and “stringent hybridization wash conditions” in the context of nucleic acid hybridization experiments such as Southern and Northern hybridization are sequence dependent, and are different under different environmental parameters. Longer sequences hybridize specifically at higher temperatures. An extensive guide to the hybridization of nucleic acids is found in Tijssen, Laboratory Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic Acid Probes, Part 1, Chapter 2, “Overview of principles of hybridization and the strategy of nucleic acid probe assays,” Elsevier, New York (1993). See also, J. Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, N.Y., pp 9.31-9.58 (1989); J. Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, N.Y. (3rd ed. 2001).

The invention also provides methods for detection and isolation of derivative or variant nucleic acids encoding nucleic acid polymerase activity. The methods involve hybridizing at least a portion of a nucleic acid comprising any one of SEQ ID NO:1-12 to a sample nucleic acid, thereby forming a hybridization complex; and detecting the hybridization complex. The presence of the complex correlates with the presence of a derivative or variant nucleic acid encoding at least a segment of nucleic acid polymerase. In general, the portion of a nucleic acid comprising any one of SEQ ID NO:1-12 used for hybridization is at least fifteen nucleotides, and hybridization is under hybridization conditions that are sufficiently stringent to permit detection and isolation of substantially homologous nucleic acids. In an alternative embodiment, a nucleic acid sample is amplified by the polymerase chain reaction using primer oligonucleotides selected from any one of SEQ ID NO:1-12.
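The idea of a probe of at least fifteen nucleotides forming a complex with a sample strand can be modelled, very crudely, as an exact-match search on either strand. The sketch below is an idealisation that assumes perfect complementarity and ignores stringency; the helper names and the sample sequence are hypothetical. The probe is the first 15 nucleotides of the sequences listed above.

```python
# Idealised model: a probe "hybridizes" if the sample contains either the
# probe itself or its reverse complement. Real hybridization tolerates
# mismatches depending on stringency; that is not modelled here.

COMPLEMENT = str.maketrans("ACGT", "TGCA")
MIN_PROBE_LENGTH = 15  # minimum portion length stated in the text

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]

def probe_matches(probe: str, sample: str) -> bool:
    """True if `probe` (>= 15 nt) exactly matches either strand of `sample`."""
    if len(probe) < MIN_PROBE_LENGTH:
        raise ValueError("probe shorter than fifteen nucleotides")
    return probe in sample or reverse_complement(probe) in sample

probe = "ATGGAGGCGATGCTT"                       # first 15 nt of the sequences above
sample = "GGG" + reverse_complement(probe) + "CCC"  # hypothetical sample strand

print(reverse_complement(probe))
print(probe_matches(probe, sample))
```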

Generally, highly stringent hybridization and wash conditions are selected to be about 5° C. lower than the thermal melting point (T_(m)) for the specific double-stranded sequence at a defined ionic strength and pH. For example, under “highly stringent conditions” or “highly stringent hybridization conditions” a nucleic acid will hybridize to its complement to a detectably greater degree than to other sequences (e.g., at least 2-fold over background). By controlling the stringency of the hybridization and/or washing conditions, nucleic acids that are 100% complementary can be identified.

Alternatively, stringency conditions can be adjusted to allow some mismatching in sequences so that lower degrees of similarity are detected (heterologous probing). Typically, stringent conditions will be those in which the salt concentration is less than about 1.5 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C. for short probes (e.g., 10 to 50 nucleotides) and at least about 60° C. for long probes (e.g., greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide.

Exemplary low stringency conditions include hybridization with a buffer solution of 30 to 35% formamide, 1 M NaCl, 1% SDS (sodium dodecyl sulfate) at 37° C., and a wash in 1× to 2×SSC (20×SSC = 3.0 M NaCl and 0.3 M trisodium citrate) at 50 to 55° C. Exemplary moderate stringency conditions include hybridization in 40 to 45% formamide, 1.0 M NaCl, 1% SDS at 37° C., and a wash in 0.5× to 1×SSC at 55 to 60° C. Exemplary high stringency conditions include hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37° C., and a wash in 0.1×SSC at 60 to 65° C.
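Because wash stringency is stated in multiples of SSC while salt limits elsewhere are stated as Na ion molarity, a small conversion can help relate the two. The sketch below assumes complete dissociation, so that 20×SSC (3.0 M NaCl and 0.3 M trisodium citrate) contributes 3.0 + 3 × 0.3 = 3.9 M sodium ion; the function name is an assumption.

```python
# Composition of 20x SSC as defined in the text.
NACL_M_AT_20X = 3.0
TRISODIUM_CITRATE_M_AT_20X = 0.3

def sodium_molarity(ssc_fold: float) -> float:
    """Approximate Na+ molarity of an n-fold SSC solution.

    Assumes full dissociation: one Na+ per NaCl and three Na+ per
    trisodium citrate.
    """
    na_at_20x = NACL_M_AT_20X + 3 * TRISODIUM_CITRATE_M_AT_20X
    return na_at_20x * ssc_fold / 20.0

for fold in (2.0, 1.0, 0.5, 0.1):
    print(f"{fold}x SSC is about {sodium_molarity(fold):.4f} M Na+")
```

Under this assumption, a 0.1×SSC high stringency wash corresponds to roughly 0.02 M sodium ion, and a 2×SSC low stringency wash to roughly 0.39 M.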

The degree of complementarity or homology of hybrids obtained during hybridization is typically a function of post-hybridization washes, the critical factors being the ionic strength and temperature of the final wash solution. The type and length of hybridizing nucleic acids also affects whether hybridization will occur and whether any hybrids formed will be stable under a given set of hybridization and wash conditions. For DNA-DNA hybrids, the T_(m) can be approximated from the equation of Meinkoth and Wahl, Anal. Biochem. 138:267-284 (1984): T_(m) = 81.5° C. + 16.6 (log M) + 0.41 (% GC) − 0.61 (% form) − 500/L; where M is the molarity of monovalent cations, % GC is the percentage of guanosine and cytosine nucleotides in the DNA, % form is the percentage of formamide in the hybridization solution, and L is the length of the hybrid in base pairs. The T_(m) is the temperature (under defined ionic strength and pH) at which 50% of a complementary target sequence hybridizes to a perfectly matched probe.
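As a worked example of the Meinkoth and Wahl approximation, the sketch below evaluates the equation for one set of inputs. The logarithm is taken to be base 10, which is the usual reading of this formula, and the input values (1.0 M monovalent cation, 50% GC, 50% formamide, 500 base pairs) are hypothetical.

```python
import math

def melting_temperature(cation_molarity: float, gc_percent: float,
                        formamide_percent: float, length_bp: int) -> float:
    """T_m in degrees C for a DNA-DNA hybrid.

    T_m = 81.5 + 16.6*log10(M) + 0.41*(%GC) - 0.61*(%form) - 500/L
    """
    return (81.5
            + 16.6 * math.log10(cation_molarity)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)

# Hypothetical inputs: 81.5 + 0 + 20.5 - 30.5 - 1.0
tm = melting_temperature(cation_molarity=1.0, gc_percent=50.0,
                         formamide_percent=50.0, length_bp=500)
print(f"T_m is approximately {tm:.1f} C")
```

Lowering the salt concentration or raising the formamide percentage lowers the computed T_(m), which is why both are used to raise stringency at a fixed temperature.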

Very stringent conditions are selected to be equal to the T_(m) for a particular probe.

An example of stringent hybridization conditions for hybridization of complementary nucleic acids that have more than 100 complementary residues on a filter in a Southern or Northern blot is 50% formamide with 1 mg of heparin at 42° C., with the hybridization being carried out overnight. An example of highly stringent conditions is 0.15 M NaCl at 72° C. for about 15 minutes. An example of stringent wash conditions is a 0.2×SSC wash at 65° C. for 15 minutes (see also, Sambrook, infra). Often, a high stringency wash is preceded by a low stringency wash to remove background probe signal. An example of a medium stringency wash for a duplex of, e.g., more than 100 nucleotides, is 1×SSC at 45° C. for 15 minutes. An example of a low stringency wash for a duplex of, e.g., more than 100 nucleotides, is 4-6×SSC at 40° C. for 15 minutes. For short probes (e.g., about 10 to 50 nucleotides), stringent conditions typically involve salt concentrations of less than about 1.0 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3, and the temperature is typically at least about 30° C.

Stringent conditions can also be achieved with the addition of destabilizing agents such as formamide. In general, a signal to noise ratio of 2× (or higher) than that observed for an unrelated probe in the particular hybridization assay indicates detection of a specific hybridization. Nucleic acids that do not hybridize to each other under stringent conditions are still substantially identical if the proteins that they encode are substantially identical. This occurs, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code.

The following are examples of sets of hybridization/wash conditions that may be used to detect and isolate homologous nucleic acids that are substantially identical to reference nucleic acids of the present invention: a reference nucleotide sequence preferably hybridizes to the reference nucleotide sequence in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 2×SSC, 0.1% SDS at 50° C., more desirably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 1×SSC, 0.1% SDS at 50° C., more desirably still in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 0.5×SSC, 0.1% SDS at 50° C., preferably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 0.1×SSC, 0.1% SDS at 50° C., more preferably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 0.1×SSC, 0.1% SDS at 65° C.

In general, T_(m) is reduced by about 1° C. for each 1% of mismatching. Thus, T_(m), hybridization, and/or wash conditions can be adjusted to hybridize to sequences of the desired sequence identity. For example, if sequences with >90% identity are sought, the T_(m) can be decreased 10° C. Generally, stringent conditions are selected to be about 5° C. lower than the thermal melting point (T_(m)) for the specific sequence and its complement at a defined ionic strength and pH. However, severely stringent conditions can utilize a hybridization and/or wash at 1, 2, 3, or 4° C. lower than the thermal melting point (T_(m)); moderately stringent conditions can utilize a hybridization and/or wash at 6, 7, 8, 9, or 10° C. lower than the thermal melting point (T_(m)); low stringency conditions can utilize a hybridization and/or wash at 11, 12, 13, 14, 15, or 20° C. lower than the thermal melting point (T_(m)).
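The arithmetic above can be sketched as follows. The starting T_(m) of 70.5° C. is a hypothetical value, the function names are assumptions, and treating every offset from 11 to 20° C. as low stringency is a simplification of the listed values (11, 12, 13, 14, 15, or 20° C.).

```python
def adjusted_tm(tm_perfect_c: float, percent_mismatch: float) -> float:
    """Lower T_m by about 1 degree C for each 1% of mismatching."""
    return tm_perfect_c - 1.0 * percent_mismatch

def stringency_label(degrees_below_tm: int) -> str:
    """Name the stringency class for a wash run this many degrees below T_m."""
    if 1 <= degrees_below_tm <= 4:
        return "severely stringent"
    if degrees_below_tm == 5:
        return "stringent"
    if 6 <= degrees_below_tm <= 10:
        return "moderately stringent"
    if 11 <= degrees_below_tm <= 20:  # simplification, see lead-in
        return "low stringency"
    return "unclassified"

tm_perfect = 70.5                         # hypothetical perfect-match T_m
tm_90 = adjusted_tm(tm_perfect, 10.0)     # sequences with >90% identity sought
wash_temp = tm_90 - 5.0                   # stringent: about 5 degrees below T_m

print(tm_90, wash_temp, stringency_label(5))
```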

If the desired degree of mismatching results in a T_(m) of less than 45° C. (aqueous solution) or 32° C. (formamide solution), it is preferred to increase the SSC concentration so that a higher temperature can be used. An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic Acid Probes, Part 1, Chapter 2 (Elsevier, New York); and Ausubel et al., eds. (1995) Current Protocols in Molecular Biology, Chapter 2 (Greene Publishing and Wiley-Interscience, New York). See Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.). Using these references and the teachings herein on the relationship between T_(m), mismatch, and hybridization and wash conditions, those of ordinary skill can generate variants of the present nucleic acid polymerase nucleic acids.

Computer analyses can also be utilized for comparison of sequences to determine sequence identity. Such analyses include, but are not limited to: CLUSTAL in the PC/Gene program (available from Intelligenetics, Mountain View, Calif.); the ALIGN program (Version 2.0) and GAP, BESTFIT, BLAST, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Version 8 (available from Genetics Computer Group (GCG), 575 Science Drive, Madison, Wis., USA). Alignments using these programs can be performed using the default parameters. The CLUSTAL program is well described by Higgins et al. Gene 73:237-244 (1988); Higgins et al. CABIOS 5:151-153 (1989); Corpet et al. Nucleic Acids Res. 16:10881-90 (1988); Huang et al. CABIOS 8:155-65 (1992); and Pearson et al. Meth. Mol. Biol. 24:307-331 (1994). The ALIGN program is based on the algorithm of Myers and Miller, supra. The BLAST programs of Altschul et al., J. Mol. Biol. 215:403 (1990), are based on the algorithm of Karlin and Altschul supra. To obtain gapped alignments for comparison purposes, Gapped BLAST (in BLAST 2.0) can be utilized as described in Altschul et al. Nucleic Acids Res. 25:3389 (1997). Alternatively, PSI-BLAST (in BLAST 2.0) can be used to perform an iterated search that detects distant relationships between molecules. See Altschul et al., supra. When utilizing BLAST, Gapped BLAST, or PSI-BLAST, the default parameters of the respective programs (e.g. BLASTN for nucleotide sequences, BLASTX for proteins) can be used. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=−4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA, 89, 10915 (1992)). See http://www.ncbi.nlm.nih.gov. Alignment may also be performed manually by inspection.
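To show what the BLASTN reward and penalty values M=5 and N=−4 mean in practice, the sketch below scores an ungapped, pre-aligned pair of sequences. It is not the BLAST algorithm: it ignores word seeding, gaps, and statistical significance, and the function name is an assumption. The sequences are positions 501-520 of SEQ ID NO:9 and SEQ ID NO:10 as printed above.

```python
def ungapped_score(a: str, b: str, match: int = 5, mismatch: int = -4) -> int:
    """Sum a match reward (M) and mismatch penalty (N) over aligned positions."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(match if x == y else mismatch for x, y in zip(a, b))

query = "GTGGCTTTGGGAGAAGTACG"    # SEQ ID NO:9, positions 501-520
subject = "GTGGCTTTGGCAGAAGTACG"  # SEQ ID NO:10, positions 501-520

# 19 matches and 1 mismatch: 19*5 + 1*(-4)
score = ungapped_score(query, subject)
print(score)
```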

For purposes of the present invention, comparison of nucleotide sequences for determination of percent sequence identity to the nucleic acid polymerase sequences disclosed herein is preferably made using the BlastN program (version 1.4.7 or later) with its default parameters or any equivalent program. By “equivalent program” is intended any sequence comparison program that, for any two sequences in question, generates an alignment having identical nucleotide or amino acid residue matches and an identical percent sequence identity when compared to the corresponding alignment generated by the preferred program.

Expression of Nucleic Acids Encoding Polymerases

Nucleic acids of the invention may be used for the recombinant expression of the nucleic acid polymerase polypeptides of the invention. Generally, recombinant expression of a nucleic acid polymerase polypeptide of the invention is effected by introducing a nucleic acid encoding that polypeptide into an expression vector adapted for use in a particular type of host cell. The nucleic acids of the invention can be introduced and expressed in any host organism, for example, in either prokaryotic or eukaryotic host cells. Examples of host cells include bacterial cells, yeast cells, cultured insect cell lines, and cultured mammalian cell lines. Preferably, a recombinant host cell system is selected that processes and post-translationally modifies nascent polypeptides in a manner similar to that of the organism from which the nucleic acid polymerase was derived. For purposes of expressing and isolating nucleic acid polymerase polypeptides of the invention, prokaryotic organisms are preferred, for example, Escherichia coli. Accordingly, the invention provides host cells comprising the expression vectors of the invention.

The nucleic acids to be introduced can be conveniently placed in expression cassettes for expression in an organism of interest. Such expression cassettes will comprise a transcriptional initiation region linked to a nucleic acid of the invention. Expression cassettes preferably also have a plurality of restriction sites for insertion of the nucleic acid to be under the transcriptional regulation of various control elements. The expression cassette additionally may contain selectable marker genes. Suitable control elements such as enhancers/promoters, splice junctions, polyadenylation signals, etc. may be placed in close proximity to the coding region of the gene if needed to permit proper initiation of transcription and/or correct processing of the primary RNA transcript. Alternatively, the coding region utilized in the expression vectors of the present invention may contain endogenous enhancers/promoters, splice junctions, intervening sequences, polyadenylation signals, etc., or a combination of both endogenous and exogenous control elements.

Preferably the nucleic acid in the vector is under the control of, and operably linked to, an appropriate promoter or other regulatory elements for transcription in a host cell. The vector may be a bi-functional expression vector that functions in multiple hosts. The transcriptional cassette generally includes, in the 5′-3′ direction of transcription, a promoter, a transcriptional and translational initiation region, a DNA sequence of interest, and a transcriptional and translational termination region functional in the organism. The termination region may be native with the transcriptional initiation region, may be native with the DNA sequence of interest, or may be derived from another source.

Efficient expression of recombinant nucleic acids in prokaryotic and eukaryotic cells generally requires regulatory control elements directing the efficient termination and polyadenylation of the resulting transcript. Transcription termination signals are generally found downstream of the polyadenylation signal and are a few hundred nucleotides in length. The term “poly A site” or “poly A sequence” as used herein denotes a nucleic acid sequence that directs both the termination and polyadenylation of the nascent RNA transcript. Efficient polyadenylation of the recombinant transcript is desirable as transcripts lacking a poly A tail are unstable and are rapidly degraded.

Nucleic acids encoding nucleic acid polymerase may be introduced into bacterial host cells by a method known to one of skill in the art. For example, nucleic acids encoding a thermophilic nucleic acid polymerase can be introduced into bacterial cells by commonly used transformation procedures such as by treatment with calcium chloride or by electroporation. If the thermophilic nucleic acid polymerase is to be expressed in eukaryotic host cells, nucleic acids encoding the thermophilic nucleic acid polymerase may be introduced into eukaryotic host cells by a number of means including calcium phosphate co-precipitation, spheroplast fusion, electroporation and the like. When the eukaryotic host cell is a yeast cell, transformation may be effected by treatment of the host cells with lithium acetate or by electroporation.

Thus, one aspect of the invention is to provide expression vectors and host cells comprising a nucleic acid encoding a nucleic acid polymerase polypeptide of the invention. A wide range of expression vectors is available in the art. Descriptions of various expression vectors and how to use them can be found, among other places, in U.S. Pat. Nos. 5,604,118; 5,583,023; 5,432,082; 5,266,490; 5,063,158; 4,966,841; 4,806,472; 4,801,537; and Goeddel et al., Gene Expression Technology, Methods in Enzymology, Vol. 185, Academic Press, San Diego (1989). The expression of nucleic acid polymerases in recombinant cell systems is a well-established technique. Examples of the recombinant expression of nucleic acid polymerase can be found in U.S. Pat. Nos. 5,602,756; 5,545,552; 5,541,311; 5,500,363; 5,489,523; 5,455,170; 5,352,778; 5,322,785; and 4,935,361.

Recombinant DNA and molecular cloning techniques that can be used to help make and use aspects of the invention are described by Sambrook et al., Molecular Cloning: A Laboratory Manual Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (2001); Ausubel (ed.), Current Protocols in Molecular Biology, John Wiley and Sons, Inc. (1994); T. Maniatis, E. F. Fritsch and J. Sambrook, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1989); and by T. J. Silhavy, M. L. Berman, and L. W. Enquist, Experiments with Gene Fusions, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1984).

Nucleic Acid Polymerase Enzymes

The invention provides Thermus thermophilus nucleic acid polymerase polypeptides, as well as fragments thereof and variant nucleic acid polymerase polypeptides that are active and thermally stable. Any polypeptide containing an amino acid sequence having any one of SEQ ID NO:13-24, which are the amino acid sequences for wild type and derivative Thermus thermophilus nucleic acid polymerases, is contemplated by the present invention. The polypeptides of the invention are isolated or substantially purified polypeptides. In particular, the isolated polypeptides of the invention are substantially free of proteins normally present in Thermus thermophilus bacteria.

In one embodiment, the invention provides a polypeptide of SEQ ID NO:13, a wild type Thermus thermophilus nucleic acid polymerase polypeptide from strain GK24. SEQ ID NO:13 is provided below.

1 MEAMLPLFEP KGRVLLVDGH HLAYRTFFAL KGLTTSRGEP VQAVYGFAKS 50 51LLKALKEDGY KAVFVVFDAK APSFRHEAYE AYKAGRAPTP EDFPRQLALI 100 101KELVDLLGFT RLEVPGYEAD DVLATLAKKA EKEGYEVRIL TADRDLYQLV 150 151SDRVAVLHPE GHLITPEWLW QKYGLKPEQW VDFRALVGDP SDNLPGVKGI 200 201GEKTALKLLK EWGSLENLLK NLDRVKPENV REKIKAHLED LRLSLELSRV 250 251RTDLPLEVDL AQGREPDREG LRAFLERLEF GSLLHEFGLL EAPAPLEEAP 300 301WPPPEGAFVG FVLSRPEPMW AELKALAACR DGRVHRAADP LAGLKDLKEV 350 351RGLLAKDLAV LASREGLDLV PGDDPMLLAY LLDPSNTTPE GVARRYGGEW 400 401TEDAAHRALL SERLHRNLLK RLQGEEKLLW LYHEVEKPLS RVLAHMEATG 450 451VRLDVAYLQA LSLELAEEIR RLEEEVFRLA GHPFNLNSRD QLERVLFDEL 500 501RLPALGKTQK TGKRSTSAAV LEALREAHPI VEKILQHREL TKLKNTYVDP 550 551LPSLVHPNTG RLHTRFNQTA TATGRLSSSD PNLQNIPVRT PLGQRIRRAF 600 601VAEAGWALVA LDYSQIELRV LAHLSGDENL IRVFQEGKDI HTQTASWMFG 650 651VPPEAVDPLM RRAAKTVNFG VLYGMSAHRL SQELAIPYEE AVAFIERYFQ 700 701SFPKVRAWIE KTLEEGRKRG YVETLFGRRR YVPDLNARVK SVREAAERMA 750 751FNMPVQGTAA DLMKLAMVKL FPRLREMGAR MLLQVHDELL LEAPQARAEE 800 801VAALAKEAME KAYPLAVPLE VEVGMGEDWL SAKG 834
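
The numbered listing above interleaves residue blocks with position counters. As a minimal sketch for working with such a listing, the following Python helper strips the counters and spacing to leave a plain residue string, then looks up a residue by its 1-based position. The function names `parse_numbered_listing` and `residue_at` are hypothetical and are not part of the disclosure; only the first two rows of SEQ ID NO:13 are used as input.

```python
import re


def parse_numbered_listing(listing: str) -> str:
    """Return only the residue letters of a numbered sequence listing.

    Position counters (digits) and spacing are discarded, so fused
    counters such as '51LLKALKEDGY' are handled as well.
    """
    return re.sub(r"[^A-Za-z]", "", listing).upper()


def residue_at(sequence: str, position: int) -> str:
    """Return the residue at a 1-based position, as used in the text."""
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside 1..{len(sequence)}")
    return sequence[position - 1]


# First two rows of SEQ ID NO:13 as printed in the listing.
listing = (
    "1 MEAMLPLFEP KGRVLLVDGH HLAYRTFFAL KGLTTSRGEP VQAVYGFAKS 50 "
    "51LLKALKEDGY KAVFVVFDAK APSFRHEAYE AYKAGRAPTP EDFPRQLALI 100"
)
fragment = parse_numbered_listing(listing)
print(len(fragment), residue_at(fragment, 46))
```

Because every non-letter character is dropped, the helper does not depend on the counters being correct or consistently spaced.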

In another embodiment, the invention provides SEQ ID NO:14, a wild type Thermus thermophilus nucleic acid polymerase enzyme from strain RQ-1. SEQ ID NO:14 is provided below.

1 MEAMLPLFEP KGRVLLVDGH HLAYRTFFAL KGLTTSRGEP VQAVYGFAKS 50 51LLKALKEDGY KAVFVVFDAK APSFRHEAYE AYKAGRAPTP EDFPRQLALI 100 101KELVDLLGFT RLEVPGFEAD DVLATLAKKA EKEGYEVRIL TADRDLYQLV 150 151SDRVAVLHPE GHLITPEWLW EKYGLRPEQW VDFRALVGDP SDNLPGVKGI 200 201GEKTALKLLK EWGSLENLLK NLDRVKPESV REKIKAHLED LRLSLELSRV 250 251RTDLPLEVDL AQGREPDREG LRAFLERLEF GSLLHEFGLL EAPAPLEEAP 300 301WPPPEGAFVG FVLSRPEPMW AELKALAACR DGRVHRAEDP LAGLKDLKEV 350 351RGLLAKDLAV LASREGLDLV PGDDPMLLAY LLDPSNTTPE GVARRYGGEW 400 401TEDAAQRALL SERLQQNLLK RLQGEEKLLW LYHEVEKPLS RVLAHMEATG 450 451VRLDVAYLQA LSLELAEEIR RLEEEVFRLA GHPFNLNSRD QLERVLFDEL 500 501RLPALGKTQK TGKRSTSAAV LEALREAHPI VEKILQHREL TKLKNTYVDP 550 551LPSLVHPRTG RLHTRFNQTA TATGRLSSSD PNLQNIPVRT PLGQRIRRAF 600 601VAEAGWALVA LDYSQIELRV LAHLSGDENL IRVFQEGKDI HTQTASWMFG 650 651VPPEAVDPLM RRAAKTVNFG VLYGMSAHRL SQELSIPYEE AVAFIERYFQ 700 701SFPKVRAWIE KTLEEGRKRG YVETLFGRRR YVPDLNARVK SVREAAERMA 750 751FNMPVQGTAA DLMKLAMVKL FPRLREMGAR MLLQVHDELL LEAPQARAEE 800 801VAALAKEAME KAYPLAVPLE VEVGIGEDWL SAKG 834

In another embodiment, the invention provides SEQ ID NO:15, a wild type Thermus thermophilus nucleic acid polymerase enzyme from strain 1b21. SEQ ID NO:15 is provided below.

1 MEAMLPLFEP KGRVLLVDGH HLAYRTFFAL KGLTTSRGEP VQAVYGFAKS 50 51LLKALKEDGY KAVFVVFDAK APSFRHEAYE AYKAGRAPTP EDFPRQLALI 100 101KELVDLLGFT RLEVPGYEAD DVLATLAKKA EKEGYEVRIL TADRDLYQLV 150 151SDRVAVLHPE GHLITPEWLW EKYGLKPEQW VDFRALVGDP SDNLPGVKGI 200 201GEKTALKLLK EWGSLENLLK NLDRVKPENV REKIKAHLED LRLSLELSRV 250 251RTDLPLEVDL AQGREPDREG LRAFLERLEF GSLLHEFGLL EAPAPLEEAP 300 301WPPPEGAFVG FVLSRPEPMW AELKALAACR DGRVHRAADP LAGLKDLKEV 350 351RGLLAKDLAV LASREGLDLV PGDDPMLLAY LLDPSNTTPE GVARRYGGEW 400 401TEDAAHRALL SERLHRNLLK RLEGEEKLLW LYHEVEKPLS RVLAHMEATG 450 451VRLDVAYLQA LSLELAEEIR RLEEEVFRLA GHPFNLNSRD QLERVLFDEL 500 501RLPALGKTQK TGKRSTSAAV LEALREAHPI VEKILQHREL TKLKNTYVDP 550 551LPSLVHPRTG RLHTRFNQTA TATGRLSSSD PNLQNIPVRT PLGQRIRRAF 600 601VAEAGWALVA LDYSQIELRV LAHLSGDENL IRVFQEGKDI HTQTASWMFG 650 651VPPEAVDPLM RRAAKTVNFG VLYGMSAHRL SQELAIPYEE AVAFIERYFQ 700 701SFPKVRAWIE KTLEEGRKRG YVETLFGRRR YVPDLNARVK SVREAAERMA 750 751FNMPVQGTAA DLMKLAMVKL FPRLREMGAR MLLQVHDELL LEAPQARAEE 800 801VAALAKEAME KAYPLAVPLE VEVGMGEDWL SAKG 834

The sequences of the wild type Thermus thermophilus nucleic acid polymerases of the invention are distinct from the amino acid sequence of known Thermus thermophilus DNA Polymerases. For example, comparison of the Thermus thermophilus, strain GK24 amino acid sequence (SEQ ID NO:13) with a published GK24 DNA Polymerase I sequence from Kwon et al., (Mol Cells. 1997 Apr. 30; 7 (2):264-71) reveals that SEQ ID NO:13 has four changes versus the Kwon sequence: Asn129→Lys, Pro130→Ala, Asp147→Tyr, and Gly797→Arg. These four positions are identified in FIGS. 1A and 1B. In each of these four positions, the Kwon GK24 DNA Polymerase I and a polypeptide with SEQ ID NO:13 have amino acids with dramatically different chemical properties. Asparagine (Kwon) at position 129 has a polar, uncharged side chain whereas lysine (SEQ ID NO:13) is charged and basic. Proline (Kwon) and alanine (SEQ ID NO:13) at position 130 are both aliphatic, but alanine promotes helix or beta sheet formation whereas proline residues generally interrupt helices and sheets. Aspartate (Kwon) at position 147 is an acidic amino acid whereas tyrosine (SEQ ID NO:13) is aromatic. Glycine (Kwon) at position 797 has the smallest amino acid side chain whereas arginine (SEQ ID NO:13) has the longest, most basic charged side chain.
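
A position-by-position comparison of this kind can be sketched in Python as follows. The sketch assumes two already-aligned, equal-length sequences with no gaps, and the function name `list_substitutions` is hypothetical. The SEQ ID NO:13 window is residues 121-150 of the listing above. The Kwon window is not copied from the Kwon publication; it is reconstructed here by reverting the three changes stated for that window (positions 129, 130 and 147), so it is an assumption for illustration only.

```python
def list_substitutions(reference: str, query: str, offset: int = 1) -> list[str]:
    """List differences between two aligned, equal-length sequences.

    Each difference is reported as '<reference residue><position><query residue>',
    where the position of the first residue is given by `offset`.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{ref}{offset + i}{qry}"
        for i, (ref, qry) in enumerate(zip(reference, query))
        if ref != qry
    ]


# Residues 121-150 of SEQ ID NO:13 (strain GK24).
seq13_window = "DVLATLAKKAEKEGYEVRILTADRDLYQLV"
# Same window with the stated changes reverted (assumed, see lead-in).
kwon_window = "DVLATLAKNPEKEGYEVRILTADRDLDQLV"

changes = list_substitutions(kwon_window, seq13_window, offset=121)
print(changes)
```

Sequences of different lengths would first need a proper alignment, which this sketch does not attempt.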

Similarly, comparison of the Thermus thermophilus, strain RQ-1 amino acid sequence (SEQ ID NO:14) with published amino acid sequences for available strains of Thermus thermophilus indicates that the Thermus thermophilus, strain RQ-1 amino acid sequence is distinct in at least two positions. At position 406, the Thermus thermophilus, strain RQ-1 amino acid sequence has glutamine, whereas the available Thermus thermophilus strains have histidine. At position 685, the Thermus thermophilus, strain RQ-1 amino acid sequence has serine, whereas the available Thermus thermophilus strains have alanine.

Moreover, comparison of the Thermus thermophilus, strain 1b21 amino acid sequence (SEQ ID NO:15) with published amino acid sequences for available strains of Thermus thermophilus indicates that the Thermus thermophilus, strain 1b21 amino acid sequence is distinct in at least one position. At position 129, the Thermus thermophilus, strain 1b21 amino acid sequence has lysine, whereas the published amino acid sequence for Thermus thermophilus strain HB8 (ATCC accession number 466573) has arginine.

Hence, several regions of the Thermus thermophilus polymerases of the invention differ from previously available Thermus thermophilus DNA polymerases.

Many DNA polymerases possess activities in addition to a DNA polymerase activity. Such activities include, for example, a 5′-3′ exonuclease activity and/or a 3′-5′ exonuclease activity. The 3′-5′ exonuclease activity improves the accuracy of the newly synthesized strand by removing incorrect bases that may have been incorporated. DNA polymerases in which such activity is low or absent are prone to errors in the incorporation of nucleotide residues into the primer extension strand. Taq DNA polymerase has been reported to have low 3′-5′ exonuclease activity. See Lawyer et al., J. Biol. Chem. 264:6427-6437. In applications such as nucleic acid amplification procedures in which the replication of DNA is often geometric in relation to the number of primer extension cycles, such errors can lead to serious artifactual problems such as sequence heterogeneity of the nucleic acid amplification product (amplicon). Thus, a 3′-5′ exonuclease activity is a desired characteristic of a thermostable DNA polymerase used for such purposes.

By contrast, the 5′-3′ exonuclease activity of DNA polymerase enzymes is often undesirable because this activity may digest nucleic acids, including primers that have an unprotected 5′ end. Thus, attenuated or absent 5′-3′ exonuclease activity is a desired characteristic of a thermostable nucleic acid polymerase for biochemical applications. Various DNA polymerase enzymes have been described in which a modification has been introduced that accomplishes this object. For example, the Klenow fragment of E. coli DNA polymerase I can be produced as a proteolytic fragment of the holoenzyme in which the domain of the protein controlling the 5′-3′ exonuclease activity has been removed. The Klenow fragment still retains the polymerase activity and the 3′-5′ exonuclease activity. Barnes, PCT Publication No. WO92/06188 (1992) and Gelfand et al., U.S. Pat. No. 5,079,352 have produced 5′-3′ exonuclease-deficient recombinant Thermus aquaticus DNA polymerases. Ishino et al., EPO Publication No. 0517418A2, have produced a 5′-3′ exonuclease-deficient DNA polymerase derived from Bacillus caldotenax.

In another embodiment, the invention provides a polypeptide that is a derivative Thermus thermophilus polypeptide with reduced or eliminated 5′-3′ exonuclease activity. Several methods exist for reducing this activity, and the invention contemplates any polypeptide derived from the Thermus thermophilus polypeptides of the invention that has such reduced or eliminated 5′-3′ exonuclease activity. See U.S. Pat. No. 5,466,591; Xu et al., Biochemical and mutational studies of the 5′-3′ exonuclease of DNA polymerase I of Escherichia coli. J. Mol. Biol. 1997 May 2; 268(2):284-302.

In one embodiment, the invention provides a Thermus thermophilus nucleic acid polymerase polypeptide from strain GK24 in which Asp is used in place of Gly at position 46. This polypeptide has SEQ ID NO:16 and reduced 5′-3′ exonuclease activity. SEQ ID NO:16 is provided below.

1 MEAMLPLFEP KGRVLLVDGH HLAYRTFFAL KGLTTSRGEP VQAVY D FAKS 50 51LLKALKEDGY KAVFVVFDAK APSFRHEAYE AYKAGRAPTP EDFPRQLALI 100 101KELVDLLGFT RLEVPGYEAD DVLATLAKKA EKEGYEVRIL TADRDLYQLV 150 151SDRVAVLHPE GHLITPEWLW QKYGLKPEQW VDFRALVGDP SDNLPGVKGI 200 201GEKTALKLLK EWGSLENLLK NLDRVKPENV REKIKAHLED LRLSLELSRV 250 251RTDLPLEVDL AQGREPDREG LRAFLERLEF GSLLHEFGLL EAPAPLEEAP 300 301WPPPEGAFVG FVLSRPEPMW AELKALAACR DGRVHRAADP LAGLKDLKEV 350 351RGLLAKDLAV LASREGLDLV PGDDPMLLAY LLDPSNTTPE GVARRYGGEW 400 401TEDAAHRALL SERLHRNLLK RLQGEEKLLW LYHEVEKPLS RVLAHMEATG 450 451VRLDVAYLQA LSLELAEEIR RLEEEVFRLA GHPFNLNSRD QLERVLFDEL 500 501RLPALGKTQK TGKRSTSAAV LEALREAHPI VEKILQHREL TKLKNTYVDP 550 551LPSLVHPNTG RLHTRFNQTA TATGRLSSSD PNLQNIPVRT PLGQRIRRAF 600 601VAEAGWALVA LDYSQIELRV LAHLSGDENL IRVFQEGKDI HTQTASWMFG 650 651VPPEAVDPLM RRAAKTVNFG VLYGMSAHRL SQELAIPYEE AVAFIERYFQ 700 701SFPKVRAWIE KTLEEGRKRG YVETLFGRRR YVPDLNARVK SVREAAERMA 750 751FNMPVQGTAA DLMKLAMVKL FPRLREMGAR MLLQVHDELL LEAPQARAEE 800 801VAALAKEAME KAYPLAVPLE VEVGMGEDWL SAKG 834
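
The Gly46→Asp substitution that distinguishes SEQ ID NO:16 from SEQ ID NO:13 can be sketched in Python as a checked point substitution. The helper name `substitute` is hypothetical. The input is only the first 50 residues of SEQ ID NO:13, and the sketch operates on the amino acid string alone; it says nothing about how the change would be made at the nucleic acid level.

```python
def substitute(sequence: str, position: int, expected: str, replacement: str) -> str:
    """Replace one residue at a 1-based position.

    The current residue is checked against `expected` first, so an
    off-by-one position or a wrong parent sequence raises an error
    instead of silently producing the wrong variant.
    """
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside 1..{len(sequence)}")
    index = position - 1
    if sequence[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


# Residues 1-50 of SEQ ID NO:13 (wild type, strain GK24).
wild_type_n_term = "MEAMLPLFEPKGRVLLVDGHHLAYRTFFALKGLTTSRGEPVQAVYGFAKS"

# Gly46 -> Asp, as described for SEQ ID NO:16.
g46d_n_term = substitute(wild_type_n_term, 46, "G", "D")
print(g46d_n_term[40:50])
```

The same call pattern, for example `substitute(full_sequence, 669, "F", "Y")` on a full-length sequence, would describe the Phe669→Tyr derivatives discussed in this section.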

In one embodiment, the invention provides a Thermus thermophilus nucleic acid polymerase polypeptide from strain RQ-1 in which Asp is used in place of Gly at position 46. This polypeptide has SEQ ID NO:17 and reduced 5′-3′ exonuclease activity. SEQ ID NO:17 is provided below.

1 MEAMLPLFEP KGRVLLVDGH HLAYRTFFAL KGLTTSRGEP VQAVY D FAKS 50 51LLKALKEDGY KAVFVVFDAK APSFRHEAYE AYKAGRAPTP EDFPRQLALI 100 101KELVDLLGFT RLEVPGFEAD DVLATLAKKA EKEGYEVRIL TADRDLYQLV 150 151SDRVAVLHPE GHLITPEWLW EKYGLRPEQW VDFRALVGDP SDNLPGVKGI 200 201GEKTALKLLK EWGSLENLLK NLDRVKPESV REKIKAHLED LRLSLELSRV 250 251RTDLPLEVDL AQGREPDREG LRAFLERLEF GSLLHEFGLL EAPAPLEEAP 300 301WPPPEGAFVG FVLSRPEPMW AELKALAACR DGRVHRAEDP LAGLKDLKEV 350 351RGLLAKDLAV LASREGLDLV PGDDPMLLAY LLDPSNTTPE GVARRYGGEW 400 401TEDAAQRALL SERLQQNLLK RLQGEEKLLW LYHEVEKPLS RVLAHMEATG 450 451VRLDVAYLQA LSLELAEEIR RLEEEVFRLA GHPFNLNSRD QLERVLFDEL 500 501RLPALGKTQK TGKRSTSAAV LEALREAHPI VEKILQHREL TKLKNTYVDP 550 551LPSLVHPRTG RLHTRFNQTA TATGRLSSSD PNLQNIPVRT PLGQRIRRAF 600 601VAEAGWALVA LDYSQIELRV LAHLSGDENL IRVFQEGKDI HTQTASWMFG 650 651VPPEAVDPLM RRAAKTVNFG VLYGMSAHRL SQELSIPYEE AVAFIERYFQ 700 701SFPKVRAWIE KTLEEGRKRG YVETLFGRRR YVPDLNARVK SVREAAERMA 750 751FNMPVQGTAA DLMKLAMVKL FPRLREMGAR MLLQVHDELL LEAPQARAEE 800 801VAALAKEAME KAYPLAVPLE VEVGIGEDWL SAKG 834

In another embodiment, the invention provides a Thermus thermophilus nucleic acid polymerase polypeptide from strain 1b21 in which Asp is used in place of Gly at position 46. This polypeptide has SEQ ID NO:18 and reduced 5′-3′ exonuclease activity. SEQ ID NO:18 is provided below.

1 MEAMLPLFEP KGRVLLVDGH HLAYRTFFAL KGLTTSRGEP VQAVY D FAKS 50 51LLKALKEDGY KAVFVVFDAK APSFRHEAYE AYKAGRAPTP EDFPRQLALI 100 101KELVDLLGFT RLEVPGYEAD DVLATLAKKA EKEGYEVRIL TADRDLYQLV 150 151SDRVAVLHPE GHLITPEWLW EKYGLKPEQW VDFRALVGDP SDNLPGVKGI 200 201GEKTALKLLK EWGSLENLLK NLDRVKPENV REKIKAHLED LRLSLELSRV 250 251RTDLPLEVDL AQGREPDREG LRAFLERLEF GSLLHEFGLL EAPAPLEEAP 300 301WPPPEGAFVG FVLSRPEPMW AELKALAACR DGRVHRAADP LAGLKDLKEV 350 351RGLLAKDLAV LASREGLDLV PGDDPMLLAY LLDPSNTTPE GVARRYGGEW 400 401TEDAAHRALL SERLHRNLLK RLEGEEKLLW LYHEVEKPLS RVLAHMEATG 450 451VRLDVAYLQA LSLELAEEIR RLEEEVFRLA GHPFNLNSRD QLERVLFDEL 500 501RLPALGKTQK TGKRSTSAAV LEALREAHPI VEKILQHREL TKLKNTYVDP 550 551LPSLVHPRTG RLHTRFNQTA TATGRLSSSD PNLQNIPVRT PLGQRIRRAF 600 601VAEAGWALVA LDYSQIELRV LAHLSGDENL IRVFQEGKDI HTQTASWMFG 650 651VPPEAVDPLM RRAAKTVNFG VLYGMSAHRL SQELAIPYEE AVAFIERYFQ 700 701SFPKVRAWIE KTLEEGRKRG YVETLFGRRR YVPDLNARVK SVREAAERMA 750 751FNMPVQGTAA DLMKLAMVKL FPRLREMGAR MLLQVHDELL LEAPQARAEE 800 801VAALAKEAME KAYPLAVPLE VEVGMGEDWL SAKG 834

In another embodiment, the invention provides a polypeptide of SEQ ID NO:19, a derivative Thermus thermophilus polypeptide from strain GK24 with reduced bias against ddNTP incorporation. SEQ ID NO:19 has Tyr in place of Phe at position 669. The sequence of SEQ ID NO:19 is below.

1 MEAMLPLFEP KGRVLLVDGH HLAYRTFFAL KGLTTSRGEP VQAVYGFAKS 50 51LLKALKEDGY KAVFVVFDAK APSFRHEAYE AYKAGRAPTP EDFPRQLALI 100 101KELVDLLGFT RLEVPGYEAD DVLATLAKKA EKEGYEVRIL TADRDLYQLV 150 151SDRVAVLHPE GHLITPEWLW QKYGLKPEQW VDFRALVGDP SDNLPGVKGI 200 201GEKTALKLLK EWGSLENLLK NLDRVKPENV REKIKAHLED LRLSLELSRV 250 251RTDLPLEVDL AQGREPDREG LRAFLERLEF GSLLHEFGLL EAPAPLEEAP 300 301WPPPEGAFVG FVLSRPEPMW AELKALAACR DGRVHRAADP LAGLKDLKEV 350 351RGLLAKDLAV LASREGLDLV PGDDPMLLAY LLDPSNTTPE GVARRYGGEW 400 401TEDAAHRALL SERLHRNLLK RLQGEEKLLW LYHEVEKPLS RVLAHMEATG 450 451VRLDVAYLQA LSLELAEEIR RLEEEVFRLA GHPFNLNSRD QLERVLFDEL 500 501RLPALGKTQK TGKRSTSAAV LEALREAHPI VEKILQHREL TKLKNTYVDP 550 551LPSLVHPNTG RLHTRFNQTA TATGRLSSSD PNLQNIPVRT PLGQRIRRAF 600 601VAEAGWALVA LDYSQIELRV LAHLSGDENL IRVFQEGKDI HTQTASWMFG 650 651VPPEAVDPLM RRAAKTVN Y G VLYGMSAHRL SQELAIPYEE AVAFIERYFQ 700 701SFPKVRAWIE KTLEEGRKRG YVETLFGRRR YVPDLNARVK SVREAAERMA 750 751FNMPVQGTAA DLMKLAMVKL FPRLREMGAR MLLQVHDELL LEAPQARAEE 800 801VAALAKEAME KAYPLAVPLE VEVGMGEDWL SAKG 834

In another embodiment, the invention provides a polypeptide of SEQ ID NO:20, a derivative Thermus thermophilus polypeptide from strain RQ-1 with reduced bias against ddNTP incorporation. SEQ ID NO:20 has Tyr in place of Phe at position 669. The sequence of SEQ ID NO:20 is below.

1 MEAMLPLFEP KGRVLLVDGH HLAYRTFFAL KGLTTSRGEP VQAVYGFAKS 50 51LLKALKEDGY KAVFVVFDAK APSFRHEAYE AYKAGRAPTP EDFPRQLALI 100 101KELVDLLGFT RLEVPGFEAD DVLATLAKKA EKEGYEVRIL TADRDLYQLV 150 151SDRVAVLHPE GHLITPEWLW EKYGLRPEQW VDFRALVGDP SDNLPGVKGI 200 201GEKTALKLLK EWGSLENLLK NLDRVKPESV REKIKAHLED LRLSLELSRV 250 251RTDLPLEVDL AQGREPDREG LRAFLERLEF GSLLHEFGLL EAPAPLEEAP 300 301WPPPEGAFVG FVLSRPEPMW AELKALAACR DGRVHRAEDP LAGLKDLKEV 350 351RGLLAKDLAV LASREGLDLV PGDDPMLLAY LLDPSNTTPE GVARRYGGEW 400 401TEDAAQRALL SERLQQNLLK RLQGEEKLLW LYHEVEKPLS RVLAHMEATG 450 451VRLDVAYLQA LSLELAEEIR RLEEEVFRLA GHPFNLNSRD QLERVLFDEL 500 501RLPALGKTQK TGKRSTSAAV LEALREAHPI VEKILQHREL TKLKNTYVDP 550 551LPSLVHPRTG RLHTRFNQTA TATGRLSSSD PNLQNIPVRT PLGQRIRRAF 600 601VAEAGWALVA LDYSQIELRV LAHLSGDENL IRVFQEGKDI HTQTASWMFG 650 651VPPEAVDPLM RRAAKTVN Y G VLYGMSAHRL SQELSIPYEE AVAFIERYFQ 700 701SFPKVRAWIE KTLEEGRKRG YVETLFGRRR YVPDLNARVK SVREAAERMA 750 751FNMPVQGTAA DLMKLAMVKL FPRLREMGAR MLLQVHDELL LEAPQARAEE 800 801VAALAKEAME KAYPLAVPLE VEVGIGEDWL SAKG 834

In another embodiment, the invention provides a polypeptide of SEQ ID NO:21, a derivative Thermus thermophilus polypeptide from strain 1b21 with reduced bias against ddNTP incorporation. SEQ ID NO:21 has Tyr in place of Phe at position 669. The sequence of SEQ ID NO:21 is below.

1 MEAMLPLFEP KGRVLLVDGH HLAYRTFFAL KGLTTSRGEP VQAVYGFAKS 50 51LLKALKEDGY KAVFVVFDAK APSFRHEAYE AYKAGRAPTP EDFPRQLALI 100 101KELVDLLGFT RLEVPGYEAD DVLATLAKKA EKEGYEVRIL TADRDLYQLV 150 151SDRVAVLHPE GHLITPEWLW EKYGLKPEQW VDFRALVGDP SDNLPGVKGI 200 201GEKTALKLLK EWGSLENLLK NLDRVKPENV REKIKAHLED LRLSLELSRV 250 251RTDLPLEVDL AQGREPDREG LRAFLERLEF GSLLHEFGLL EAPAPLEEAP 300 301WPPPEGAFVG FVLSRPEPMW AELKALAACR DGRVHRAADP LAGLKDLKEV 350 351RGLLAKDLAV LASREGLDLV PGDDPMLLAY LLDPSNTTPE GVARRYGGEW 400 401TEDAAHRALL SERLHRNLLK RLEGEEKLLW LYHEVEKPLS RVLAHMEATG 450 451VRLDVAYLQA LSLELAEEIR RLEEEVFRLA GHPFNLNSRD QLERVLFDEL 500 501RLPALGKTQK TGKRSTSAAV LEALREAHPI VEKILQHREL TKLKNTYVDP 550 551LPSLVHPRTG RLHTRFNQTA TATGRLSSSD PNLQNIPVRT PLGQRIRRAF 600 601VAEAGWALVA LDYSQIELRV LAHLSGDENL IRVFQEGKDI HTQTASWMFG 650 651VPPEAVDPLM RRAAKTVN Y G VLYGMSAHRL SQELAIPYEE AVAFIERYFQ 700 701SFPKVRAWIE KTLEEGRKRG YVETLFGRRR YVPDLNARVK SVREAAERMA 750 751FNMPVQGTAA DLMKLAMVKL FPRLREMGAR MLLQVHDELL LEAPQARAEE 800 801VAALAKEAME KAYPLAVPLE VEVGMGEDWL SAKG 834

In another embodiment, the invention provides a polypeptide of SEQ ID NO:22, a derivative Thermus thermophilus polypeptide from strain GK24 with reduced 5′-3′ exonuclease activity and reduced bias against ddNTP incorporation. SEQ ID NO:22 has Asp in place of Gly at position 46 and Tyr in place of Phe at position 669. The sequence of SEQ ID NO:22 is below.

1 MEAMLPLFEP KGRVLLVDGH HLAYRTFFAL KGLTTSRGEP VQAVY D FAKS 50 51LLKALKEDGY KAVFVVFDAK APSFRHEAYE AYKAGRAPTP EDFPRQLALI 100 101KELVDLLGFT RLEVPGYEAD DVLATLAKKA EKEGYEVRIL TADRDLYQLV 150 151SDRVAVLHPE GHLITPEWLW QKYGLKPEQW VDFRALVGDP SDNLPGVKGI 200 201GEKTALKLLK EWGSLENLLK NLDRVKPENV REKIKAHLED LRLSLELSRV 250 251RTDLPLEVDL AQGREPDREG LRAFLERLEF GSLLHEFGLL EAPAPLEEAP 300 301WPPPEGAFVG FVLSRPEPMW AELKALAACR DGRVHRAADP LAGLKDLKEV 350 351RGLLAKDLAV LASREGLDLV PGDDPMLLAY LLDPSNTTPE GVARRYGGEW 400 401TEDAAHRALL SERLHRNLLK RLQGEEKLLW LYHEVEKPLS RVLAHMEATG 450 451VRLDVAYLQA LSLELAEEIR RLEEEVFRLA GHPFNLNSRD QLERVLFDEL 500 501RLPALGKTQK TGKRSTSAAV LEALREAHPI VEKILQHREL TKLKNTYVDP 550 551LPSLVHPNTG RLHTRFNQTA TATGRLSSSD PNLQNIPVRT PLGQRIRRAF 600 601VAEAGWALVA LDYSQIELRV LAHLSGDENL IRVFQEGKDI HTQTASWMFG 650 651VPPEAVDPLM RRAAKTVN Y G VLYGMSAHRL SQELAIPYEE AVAFIERYFQ 700 701SFPKVRAWIE KTLEEGRKRG YVETLFGRRR YVPDLNARVK SVREAAERMA 750 751FNMPVQGTAA DLMKLAMVKL FPRLREMGAR MLLQVHDELL LEAPQARAEE 800 801VAALAKEAME KAYPLAVPLE VEVGMGEDWL SAKG 834

In another embodiment, the invention provides a polypeptide of SEQ ID NO:23, a derivative Thermus thermophilus polypeptide from strain RQ-1 with reduced 5′-3′ exonuclease activity and reduced bias against ddNTP incorporation. SEQ ID NO:23 has Asp in place of Gly at position 46 and Tyr in place of Phe at position 669. The sequence of SEQ ID NO:23 is below.

1 MEAMLPLFEP KGRVLLVDGH HLAYRTFFAL KGLTTSRGEP VQAVY D FAKS 50 51LLKALKEDGY KAVFVVFDAK APSFRHEAYE AYKAGRAPTP EDFPRQLALI 100 101KELVDLLGFT RLEVPGFEAD DVLATLAKKA EKEGYEVRIL TADRDLYQLV 150 151SDRVAVLHPE GHLITPEWLW EKYGLRPEQW VDFRALVGDP SDNLPGVKGI 200 201GEKTALKLLK EWGSLENLLK NLDRVKPESV REKIKAHLED LRLSLELSRV 250 251RTDLPLEVDL AQGREPDREG LRAFLERLEF GSLLHEFGLL EAPAPLEEAP 300 301WPPPEGAFVG FVLSRPEPMW AELKALAACR DGRVHRAEDP LAGLKDLKEV 350 351RGLLAKDLAV LASREGLDLV PGDDPMLLAY LLDPSNTTPE GVARRYGGEW 400 401TEDAAQRALL SERLQQNLLK RLQGEEKLLW LYHEVEKPLS RVLAHMEATG 450 451VRLDVAYLQA LSLELAEEIR RLEEEVFRLA GHPFNLNSRD QLERVLFDEL 500 501RLPALGKTQK TGKRSTSAAV LEALREAHPI VEKILQHREL TKLKNTYVDP 550 551LPSLVHPRTG RLHTRFNQTA TATGRLSSSD PNLQNIPVRT PLGQRIRRAF 600 601VAEAGWALVA LDYSQIELRV LAHLSGDENL IRVFQEGKDI HTQTASWMFG 650 651VPPEAVDPLM RRAAKTVN Y G VLYGMSAHRL SQELSIPYEE AVAFIERYFQ 700 701SFPKVRAWIE KTLEEGRKRG YVETLFGRRR YVPDLNARVK SVREAAERMA 750 751FNMPVQGTAA DLMKLAMVKL FPRLREMGAR MLLQVHDELL LEAPQARAEE 800 801VAALAKEAME KAYPLAVPLE VEVGIGEDWL SAKG 834

In another embodiment, the invention provides a polypeptide of SEQ ID NO:24, a derivative Thermus thermophilus polypeptide from strain 1b21 with reduced 5′-3′ exonuclease activity and reduced bias against ddNTP incorporation. SEQ ID NO:24 has Asp in place of Gly at position 46 and Tyr in place of Phe at position 669. The sequence of SEQ ID NO:24 is below.

1 MEAMLPLFEP KGRVLLVDGH HLAYRTFFAL KGLTTSRGEP VQAVY D FAKS 50 51LLKALKEDGY KAVFVVFDAK APSFRHEAYE AYKAGRAPTP EDFPRQLALI 100 101KELVDLLGFT RLEVPGYEAD DVLATLAKKA EKEGYEVRIL TADRDLYQLV 150 151SDRVAVLHPE GHLITPEWLW EKYGLKPEQW VDFRALVGDP SDNLPGVKGI 200 201GEKTALKLLK EWGSLENLLK NLDRVKPENV REKIKAHLED LRLSLELSRV 250 251RTDLPLEVDL AQGREPDREG LRAFLERLEF GSLLHEFGLL EAPAPLEEAP 300 301WPPPEGAFVG FVLSRPEPMW AELKALAACR DGRVHRAADP LAGLKDLKEV 350 351RGLLAKDLAV LASREGLDLV PGDDPMLLAY LLDPSNTTPE GVARRYGGEW 400 401TEDAAHRALL SERLHRNLLK RLEGEEKLLW LYHEVEKPLS RVLAHMEATG 450 451VRLDVAYLQA LSLELAEEIR RLEEEVFRLA GHPFNLNSRD QLERVLFDEL 500 501RLPALGKTQK TGKRSTSAAV LEALREAHPI VEKILQHREL TKLKNTYVDP 550 551LPSLVHPRTG RLHTRFNQTA TATGRLSSSD PNLQNIPVRT PLGQRIRRAF 600 601VAEAGWALVA LDYSQIELRV LAHLSGDENL IRVFQEGKDI HTQTASWMFG 650 651VPPEAVDPLM RRAAKTVN Y G VLYGMSAHRL SQELAIPYEE AVAFIERYFQ 700 701SFPKVRAWIE KTLEEGRKRG YVETLFGRRR YVPDLNARVK SVREAAERMA 750 751FNMPVQGTAA DLMKLAMVKL FPRLREMGAR MLLQVHDELL LEAPQARAEE 800 801VAALAKEAME KAYPLAVPLE VEVGMGEDWL SAKG 834

The nucleic acid polymerase polypeptides of the invention have homology to portions of the amino acid sequences of the thermostable DNA polymerases from other strains of Thermus thermophilus. However, several portions of the amino acid sequences of the present invention are distinct (see FIGS. 1A and 1B and FIGS. 2A, 2B, and 2C).

As indicated above, derivative and variant polypeptides of the invention are derived from the wild type nucleic acid polymerase by deletion or addition of one or more amino acids to the N-terminal and/or C-terminal end of the wild type polypeptide; deletion or addition of one or more amino acids at one or more sites within the wild type polypeptide; or substitution of one or more amino acids at one or more sites within the wild type polypeptide. Thus, the polypeptides of the invention may be altered in various ways including amino acid substitutions, deletions, truncations, and insertions.

Such variant and derivative polypeptides may result, for example, from genetic polymorphism or from human manipulation. Methods for such manipulations are generally known in the art. For example, amino acid sequence variants of the polypeptides can be prepared by mutations in the DNA. Methods for mutagenesis and nucleotide sequence alterations are well known in the art. See, for example, Kunkel, Proc. Natl. Acad. Sci. USA, 82, 488 (1985); Kunkel et al., Methods in Enzymol., 154, 367 (1987); U.S. Pat. No. 4,873,192; Walker and Gaastra, eds., Techniques in Molecular Biology, MacMillan Publishing Company, New York (1983) and the references cited therein. Guidance as to appropriate amino acid substitutions that do not affect biological activity of the protein of interest may be found in the model of Dayhoff et al., Atlas of Protein Sequence and Structure, Natl. Biomed. Res. Found., Washington, D.C. (1978), herein incorporated by reference.

The derivatives and variants of the isolated polypeptides of the invention have identity with at least about 98% of the amino acid positions of any one of SEQ ID NO:13-24 and have nucleic acid polymerase activity and/or are thermally stable. In a preferred embodiment, polypeptide derivatives and variants have identity with at least about 99% of the amino acid positions of any one of SEQ ID NO:13-24 and have nucleic acid polymerase activity and/or are thermally stable.
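
To make the identity thresholds concrete, the following Python sketch computes percent identity between two equal-length, ungapped sequences and the largest number of differing positions that still satisfies a threshold. The function names are hypothetical, and the sketch treats "at least about 98%" as a hard cutoff of 98%, which is a simplifying assumption; the text does not define "about" numerically. Sequences of unequal length would first require an alignment, which is not shown.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions that are identical in two equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def max_mismatches(length: int, min_percent: int) -> int:
    """Largest number of differing positions that keeps identity >= min_percent.

    Integer arithmetic is used so the result does not depend on
    floating-point rounding.
    """
    return length * (100 - min_percent) // 100


# Residues 1-50 of SEQ ID NO:13 and the same window with Gly46 -> Asp.
wild_type = "MEAMLPLFEPKGRVLLVDGHHLAYRTFFALKGLTTSRGEPVQAVYGFAKS"
variant = "MEAMLPLFEPKGRVLLVDGHHLAYRTFFALKGLTTSRGEPVQAVYDFAKS"

print(percent_identity(wild_type, variant))
# The full-length polypeptides listed in this section are 834 residues long.
print(max_mismatches(834, 98), max_mismatches(834, 99))
```

Under this reading, an 834-residue variant could differ from a reference sequence at up to 16 positions at the 98% level and up to 8 positions at the 99% level.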

Amino acid residues of the isolated polypeptides and polypeptide derivatives and variants can be genetically encoded L-amino acids, naturally occurring non-genetically encoded L-amino acids, synthetic L-amino acids or D-enantiomers of any of the above. The amino acid notations used herein for the twenty genetically encoded L-amino acids and common non-encoded amino acids are conventional and are as shown in Table 2.

TABLE 2

Amino Acid                                          One-Letter Symbol   Common Abbreviation
Alanine                                             A                   Ala
Arginine                                            R                   Arg
Asparagine                                          N                   Asn
Aspartic acid                                       D                   Asp
Cysteine                                            C                   Cys
Glutamine                                           Q                   Gln
Glutamic acid                                       E                   Glu
Glycine                                             G                   Gly
Histidine                                           H                   His
Isoleucine                                          I                   Ile
Leucine                                             L                   Leu
Lysine                                              K                   Lys
Methionine                                          M                   Met
Phenylalanine                                       F                   Phe
Proline                                             P                   Pro
Serine                                              S                   Ser
Threonine                                           T                   Thr
Tryptophan                                          W                   Trp
Tyrosine                                            Y                   Tyr
Valine                                              V                   Val
β-Alanine                                                               Bala
2,3-Diaminopropionic acid                                               Dpr
α-Aminoisobutyric acid                                                  Aib
N-Methylglycine (sarcosine)                                             MeGly
Ornithine                                                               Orn
Citrulline                                                              Cit
t-Butylalanine                                                          t-BuA
t-Butylglycine                                                          t-BuG
N-methylisoleucine                                                      MeIle
Phenylglycine                                                           Phg
Cyclohexylalanine                                                       Cha
Norleucine                                                              Nle
Naphthylalanine                                                         Nal
Pyridylalanine
3-Benzothienyl alanine
4-Chlorophenylalanine                                                   Phe(4-Cl)
2-Fluorophenylalanine                                                   Phe(2-F)
3-Fluorophenylalanine                                                   Phe(3-F)
4-Fluorophenylalanine                                                   Phe(4-F)
Penicillamine                                                           Pen
1,2,3,4-Tetrahydroisoquinoline-3-carboxylic acid                        Tic
β-2-thienylalanine                                                      Thi
Methionine sulfoxide                                                    MSO
Homoarginine                                                            Harg
N-acetyl lysine                                                         AcLys
2,4-Diaminobutyric acid                                                 Dbu
p-Aminophenylalanine                                                    Phe(pNH₂)
N-methylvaline                                                          MeVal
Homocysteine                                                            Hcys
Homoserine                                                              Hser
ε-Amino hexanoic acid                                                   Aha
δ-Amino valeric acid                                                    Ava
2,3-Diaminobutyric acid                                                 Dab

Polypeptide variants that are encompassed within the scope of the invention can have one or more amino acids substituted with an amino acid of similar chemical and/or physical properties, so long as these variant polypeptides retain polymerase activity and/or remain thermally stable. Derivative polypeptides can have one or more amino acids substituted with amino acids having different chemical and/or physical properties, so long as these derivative polypeptides retain polymerase activity and/or remain thermally stable.

Amino acids that are substitutable for each other in the present variant polypeptides generally reside within similar classes or subclasses. As known to one of skill in the art, amino acids can be placed into three main classes: hydrophilic amino acids, hydrophobic amino acids and cysteine-like amino acids, depending primarily on the characteristics of the amino acid side chain. These main classes may be further divided into subclasses. Hydrophilic amino acids include amino acids having acidic, basic or polar side chains and hydrophobic amino acids include amino acids having aromatic or apolar side chains. Apolar amino acids may be further subdivided to include, among others, aliphatic amino acids. The definitions of the classes of amino acids as used herein are as follows:

“Hydrophobic Amino Acid” refers to an amino acid having a side chain that is uncharged at physiological pH and that is repelled by aqueous solution. Examples of genetically encoded hydrophobic amino acids include Ile, Leu and Val. Examples of non-genetically encoded hydrophobic amino acids include t-BuA.

“Aromatic Amino Acid” refers to a hydrophobic amino acid having a side chain containing at least one ring having a conjugated π-electron system (aromatic group). The aromatic group may be further substituted with substituent groups such as alkyl, alkenyl, alkynyl, hydroxyl, sulfonyl, nitro and amino groups, as well as others. Examples of genetically encoded aromatic amino acids include phenylalanine, tyrosine and tryptophan. Commonly encountered non-genetically encoded aromatic amino acids include phenylglycine, 2-naphthylalanine, β-2-thienylalanine, 1,2,3,4-tetrahydroisoquinoline-3-carboxylic acid, 4-chlorophenylalanine, 2-fluorophenylalanine, 3-fluorophenylalanine and 4-fluorophenylalanine.

“Apolar Amino Acid” refers to a hydrophobic amino acid having a side chain that is generally uncharged at physiological pH and that is not polar. Examples of genetically encoded apolar amino acids include glycine, proline and methionine. Examples of non-encoded apolar amino acids include Cha.

“Aliphatic Amino Acid” refers to an apolar amino acid having a saturated or unsaturated straight chain, branched or cyclic hydrocarbon side chain. Examples of genetically encoded aliphatic amino acids include Ala, Leu, Val and Ile. Examples of non-encoded aliphatic amino acids include Nle.

“Hydrophilic Amino Acid” refers to an amino acid having a side chain that is attracted by aqueous solution. Examples of genetically encoded hydrophilic amino acids include Ser and Lys. Examples of non-encoded hydrophilic amino acids include Cit and hCys.

“Acidic Amino Acid” refers to a hydrophilic amino acid having a side chain pK value of less than 7. Acidic amino acids typically have negatively charged side chains at physiological pH due to loss of a hydrogen ion. Examples of genetically encoded acidic amino acids include aspartic acid (aspartate) and glutamic acid (glutamate).

“Basic Amino Acid” refers to a hydrophilic amino acid having a side chain pK value of greater than 7. Basic amino acids typically have positively charged side chains at physiological pH due to association with hydronium ion. Examples of genetically encoded basic amino acids include arginine, lysine and histidine. Examples of non-genetically encoded basic amino acids include the non-cyclic amino acids ornithine, 2,3-diaminopropionic acid, 2,4-diaminobutyric acid and homoarginine.

“Polar Amino Acid” refers to a hydrophilic amino acid having a side chain that is uncharged at physiological pH, but which has a bond in which the pair of electrons shared in common by two atoms is held more closely by one of the atoms. Examples of genetically encoded polar amino acids include asparagine and glutamine. Examples of non-genetically encoded polar amino acids include citrulline, N-acetyl lysine and methionine sulfoxide.

“Cysteine-Like Amino Acid” refers to an amino acid having a side chain capable of forming a covalent linkage with a side chain of another amino acid residue, such as a disulfide linkage. Typically, cysteine-like amino acids have a side chain containing at least one thiol (SH) group. Examples of genetically encoded cysteine-like amino acids include cysteine. Examples of non-genetically encoded cysteine-like amino acids include homocysteine and penicillamine.

As will be appreciated by those having skill in the art, the above classifications are not absolute. Several amino acids exhibit more than one characteristic property, and can therefore be included in more than one category. For example, tyrosine has both an aromatic ring and a polar hydroxyl group. Thus, tyrosine has dual properties and can be included in both the aromatic and polar categories. Similarly, in addition to being able to form disulfide linkages, cysteine also has apolar character. Thus, while not strictly classified as a hydrophobic or apolar amino acid, in many instances cysteine can be used to confer hydrophobicity to a polypeptide.

Certain commonly encountered amino acids that are not genetically encoded and that can be present, or substituted for an amino acid, in the variant polypeptides of the invention include, but are not limited to, β-alanine (b-Ala) and other omega-amino acids such as 3-aminopropionic acid (Dap), 2,3-diaminopropionic acid (Dpr), 4-aminobutyric acid and so forth; α-aminoisobutyric acid (Aib); ε-aminohexanoic acid (Aha); δ-aminovaleric acid (Ava); N-methylglycine (MeGly); ornithine (Orn); citrulline (Cit); t-butylalanine (t-BuA); t-butylglycine (t-BuG); N-methylisoleucine (MeIle); phenylglycine (Phg); cyclohexylalanine (Cha); norleucine (Nle); 2-naphthylalanine (2-Nal); 4-chlorophenylalanine (Phe(4-Cl)); 2-fluorophenylalanine (Phe(2-F)); 3-fluorophenylalanine (Phe(3-F)); 4-fluorophenylalanine (Phe(4-F)); penicillamine (Pen); 1,2,3,4-tetrahydroisoquinoline-3-carboxylic acid (Tic); β-2-thienylalanine (Thi); methionine sulfoxide (MSO); homoarginine (hArg); N-acetyl lysine (AcLys); 2,4-diaminobutyric acid (Dab); 2,3-diaminobutyric acid (Dbu); p-aminophenylalanine (Phe(pNH₂)); N-methyl valine (MeVal); homocysteine (hCys) and homoserine (hSer). These amino acids also fall into the categories defined above.

The classifications of the above-described genetically encoded and non-encoded amino acids are summarized in Table 3, below. It is to be understood that Table 3 is for illustrative purposes only and does not purport to be an exhaustive list of amino acid residues that may comprise the variant and derivative polypeptides described herein. Other amino acid residues that are useful for making the variant and derivative polypeptides described herein can be found, e.g., in Fasman, 1989, CRC Practical Handbook of Biochemistry and Molecular Biology, CRC Press, Inc., and the references cited therein. Amino acids not specifically mentioned herein can be conveniently classified into the above-described categories on the basis of known behavior and/or their characteristic chemical and/or physical properties as compared with amino acids specifically identified.

TABLE 3

Classification   Genetically Encoded   Genetically Non-Encoded
Hydrophobic      F, L, I, V
Aromatic         F, Y, W               Phg, Nal, Thi, Tic, Phe(4-Cl), Phe(2-F), Phe(3-F), Phe(4-F), Pyridyl Ala, Benzothienyl Ala
Apolar           M, G, P
Aliphatic        A, V, L, I            t-BuA, t-BuG, MeIle, Nle, MeVal, Cha, bAla, MeGly, Aib
Hydrophilic      S, K                  Cit, hCys
Acidic           D, E
Basic            H, K, R               Dpr, Orn, hArg, Phe(p-NH₂), DBU, A₂BU
Polar            Q, N, S, T, Y         Cit, AcLys, MSO, hSer
Cysteine-Like    C                     Pen, hCys, β-methyl Cys

Polypeptides of the invention can have any amino acid substituted by any similarly classified amino acid to create a variant peptide, so long as the peptide variant is thermally stable and/or retains DNA polymerase activity.
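By way of illustration only, the notion of a "similarly classified" substitution can be sketched in Python. The class memberships below are transcribed from the genetically encoded column of Table 3; the function names (classes_of, similarly_classified) are hypothetical and are not part of the disclosure. The sketch only tests class membership and says nothing about whether a given variant is in fact thermally stable or active.

```python
# Classes transcribed from the genetically encoded column of Table 3.
# A residue may belong to more than one class (e.g., Y is Aromatic and Polar).
TABLE3_CLASSES = {
    "Hydrophobic": set("FLIV"),
    "Aromatic": set("FYW"),
    "Apolar": set("MGP"),
    "Aliphatic": set("AVLI"),
    "Hydrophilic": set("SK"),
    "Acidic": set("DE"),
    "Basic": set("HKR"),
    "Polar": set("QNSTY"),
    "Cysteine-Like": set("C"),
}


def classes_of(residue):
    """Return the set of Table 3 class names that contain the residue."""
    return {name for name, members in TABLE3_CLASSES.items() if residue in members}


def similarly_classified(original, replacement):
    """True if the two one-letter residues share at least one Table 3 class."""
    return bool(classes_of(original) & classes_of(replacement))


if __name__ == "__main__":
    for pair in [("L", "I"), ("D", "E"), ("D", "Y"), ("N", "K")]:
        print(pair, similarly_classified(*pair))
```

Under these assumptions, Leu→Ile and Asp→Glu are reported as similarly classified, whereas Asp→Tyr and Asn→Lys are not.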

“Domain shuffling” or construction of “thermostable chimeric nucleic acid polymerases” may be used to provide thermostable polymerases containing novel properties. For example, placement of codons 289-422 from one of the present Thermus thermophilus polymerase coding sequences after codons 1-288 of the Thermus aquaticus DNA polymerase would yield a novel thermostable nucleic acid polymerase containing the 5′-3′ exonuclease domain of Thermus aquaticus DNA polymerase (1-288), the 3′-5′ exonuclease domain of Thermus thermophilus nucleic acid polymerase (289-422), and the DNA polymerase domain of Thermus aquaticus DNA polymerase (423-832). Alternatively, the 5′-3′ exonuclease domain and the 3′-5′ exonuclease domain of one of the present Thermus thermophilus nucleic acid polymerases may be fused to the DNA polymerase (dNTP binding and primer/template binding domains) portion of Thermus aquaticus DNA polymerase (about codons 423-832). The donors and recipients need not be limited to Thermus aquaticus and Thermus thermophilus polymerases. The Thermus thermophilus polymerase, 3′-5′ exonuclease, 5′-3′ exonuclease and/or other domains can similarly be exchanged for those from other species of Thermus.
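The codon bookkeeping for such a chimera can be sketched as follows. This is a minimal Python illustration, assuming coding sequences held as plain strings and 1-based, inclusive codon numbering; the helper names (codon_slice, build_chimera) and the placeholder sequences are hypothetical, and no real gene sequence is reproduced.

```python
def codon_slice(coding_seq, first_codon, last_codon):
    """Return the nucleotides for codons first_codon..last_codon (1-based, inclusive)."""
    return coding_seq[3 * (first_codon - 1): 3 * last_codon]


def build_chimera(taq_cds, tth_cds):
    """Join Taq codons 1-288, Tth codons 289-422 and Taq codons 423-832."""
    return (
        codon_slice(taq_cds, 1, 288)
        + codon_slice(tth_cds, 289, 422)
        + codon_slice(taq_cds, 423, 832)
    )


if __name__ == "__main__":
    # Placeholder sequences only: "a" marks Taq-derived and "c" marks
    # Tth-derived nucleotides so that the junctions are easy to see.
    taq = "a" * (3 * 832)
    tth = "c" * (3 * 834)
    chimera = build_chimera(taq, tth)
    print(len(chimera) // 3, "codons;", chimera.count("c") // 3, "from Tth")
```

With these placeholder inputs the chimera is 832 codons long, of which 134 (codons 289-422) derive from the Thermus thermophilus sequence.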

It has been demonstrated that the exonuclease domain of Thermus aquaticus Polymerase I can be removed from the amino terminus of the protein without a significant loss of thermostability or polymerase activity (Erlich et al., (1991) Science 252: 1643-1651; Barnes, W. M., (1992) Gene 112:29-35; Lawyer et al., (1989) JBC 264:6427-6437). Other N-terminal deletions similarly have been shown to maintain thermostability and activity (Vainshtein et al., (1996) Protein Science 5:1785-1792 and references therein). Therefore this invention also includes similarly truncated forms of any of the wild type or variant polymerases provided herein. For example, the invention is also directed to an active truncated variant of any of the polymerases provided by the invention in which the first 330 amino acids are removed.

Moreover, the invention provides SEQ ID NO:29, a truncated form of a polymerase in which the N-terminal 289 amino acids have been removed from the wild type Thermus thermophilus polymerase from strain GK24.
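As a simple illustration of how residue numbering carries over to an N-terminally truncated form, the following Python sketch removes a stated number of residues and reports the original position of the first retained residue. The function name truncate_n_terminus and the placeholder sequence are hypothetical; the 834-residue length is taken from the numbering of the truncated sequences listed herein.

```python
def truncate_n_terminus(protein, n_removed):
    """Remove the first n_removed residues.

    Returns (fragment, first_position), where first_position is the
    1-based position, in the full-length numbering, of the first
    retained residue.
    """
    if not 0 <= n_removed < len(protein):
        raise ValueError("n_removed must leave at least one residue")
    return protein[n_removed:], n_removed + 1


if __name__ == "__main__":
    full_length = "X" * 834  # placeholder for an 834-residue polymerase
    fragment, first_position = truncate_n_terminus(full_length, 289)
    print(len(fragment), "residues retained, starting at position", first_position)
```

Removing 289 residues from an 834-residue polypeptide leaves 545 residues numbered 290 to 834.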

290 L EAPAPLEEAP 300
301 WPPPEGAFVG FVLSRPEPMW AELKALAACR DGRVHRAADP LAGLKDLKEV 350
351 RGLLAKDLAV LASREGLDLV PGDDPMLLAY LLDPSNTTPE GVARRYGGEW 400
401 TEDAAHRALL SERLHRNLLK RLQGEEKLLW LYHEVEKPLS RVLAHMEATG 450
451 VRLDVAYLQA LSLELAEEIR RLEEEVFRLA GHPFNLNSRD QLERVLFDEL 500
501 RLPALGKTQK TGKRSTSAAV LEALREAHPI VEKILQHREL TKLKNTYVDP 550
551 LPSLVHPNTG RLHTRFNQTA TATGRLSSSD PNLQNIPVRT PLGQRIRRAF 600
601 VAEAGWALVA LDYSQIELRV LAHLSGDENL IRVFQEGKDI HTQTASWMFG 650
651 VPPEAVDPLM RRAAKTVNFG VLYGMSAHRL SQELAIPYEE AVAFIERYFQ 700
701 SFPKVRAWIE KTLEEGRKRG YVETLFGRRR YVPDLNARVK SVREAAERMA 750
751 FNMPVQGTAA DLMKLAMVKL FPRLREMGAR MLLQVHDELL LEAPQARAEE 800
801 VAALAKEAME KAYPLAVPLE VEVGMGEDWL SAKG 834

In another embodiment, the invention provides SEQ ID NO:30, a truncated form of a polymerase in which the N-terminal 289 amino acids have been removed from the wild type Thermus thermophilus polymerase from strain RQ-1. SEQ ID NO:30 is provided below.

290 L EAPAPLEEAP 300
301 WPPPEGAFVG FVLSRPEPMW AELKALAACR DGRVHRAEDP LAGLKDLKEV 350
351 RGLLAKDLAV LASREGLDLV PGDDPMLLAY LLDPSNTTPE GVARRYGGEW 400
401 TEDAAQRALL SERLQQNLLK RLQGEEKLLW LYHEVEKPLS RVLAHMEATG 450
451 VRLDVAYLQA LSLELAEEIR RLEEEVFRLA GHPFNLNSRD QLERVLFDEL 500
501 RLPALGKTQK TGKRSTSAAV LEALREAHPI VEKILQHREL TKLKNTYVDP 550
551 LPSLVHPRTG RLHTRFNQTA TATGRLSSSD PNLQNIPVRT PLGQRIRRAF 600
601 VAEAGWALVA LDYSQIELRV LAHLSGDENL IRVFQEGKDI HTQTASWMFG 650
651 VPPEAVDPLM RRAAKTVNFG VLYGMSAHRL SQELSIPYEE AVAFIERYFQ 700
701 SFPKVRAWIE KTLEEGRKRG YVETLFGRRR YVPDLNARVK SVREAAERMA 750
751 FNMPVQGTAA DLMKLAMVKL FPRLREMGAR MLLQVHDELL LEAPQARAEE 800
801 VAALAKEAME KAYPLAVPLE VEVGIGEDWL SAKG 834

In another embodiment, the invention provides SEQ ID NO:30, a truncated form of a polymerase in which the N-terminal 289 amino acids have been removed from the wild type Thermus thermophilus polymerase from strain 1b21. SEQ ID NO:30 is provided below.

290 L EAPAPLEEAP 300
301 WPPPEGAFVG FVLSRPEPMW AELKALAACR DGRVHRAADP LAGLKDLKEV 350
351 RGLLAKDLAV LASREGLDLV PGDDPMLLAY LLDPSNTTPE GVARRYGGEW 400
401 TEDAAHRALL SERLHRNLLK RLEGEEKLLW LYHEVEKPLS RVLAHMEATG 450
451 VRLDVAYLQA LSLELAEEIR RLEEEVFRLA GHPFNLNSRD QLERVLFDEL 500
501 RLPALGKTQK TGKRSTSAAV LEALREAHPI VEKILQHREL TKLKNTYVDP 550
551 LPSLVHPRTG RLHTRFNQTA TATGRLSSSD PNLQNIPVRT PLGQRIRRAF 600
601 VAEAGWALVA LDYSQIELRV LAHLSGDENL IRVFQEGKDI HTQTASWMFG 650
651 VPPEAVDPLM RRAAKTVNFG VLYGMSAHRL SQELAIPYEE AVAFIERYFQ 700
701 SFPKVRAWIE KTLEEGRKRG YVETLFGRRR YVPDLNARVK SVREAAERMA 750
751 FNMPVQGTAA DLMKLAMVKL FPRLREMGAR MLLQVHDELL LEAPQARAEE 800
801 VAALAKEAME KAYPLAVPLE VEVGMGEDWL SAKG 834

Thus, the polypeptides of the invention encompass both naturally occurring proteins as well as variations, truncations and modified forms thereof. Such variants will continue to possess the desired activity. The deletions, insertions, and substitutions of the polypeptide sequence encompassed herein are not expected to produce radical changes in the characteristics of the polypeptide. One skilled in the art can readily evaluate the thermal stability and polymerase activity of the polypeptides and variant polypeptides of the invention by routine screening assays.

Kits and compositions containing the present polypeptides are substantially free of cellular material. Such preparations and compositions have less than about 30%, 20%, 10%, or 5% (by dry weight) of contaminating bacterial cellular protein.

The activity of nucleic acid polymerase polypeptides and variant polypeptides can be assessed by any procedure known to one of skill in the art. For example, the DNA synthetic activity of the variant and non-variant polymerase polypeptides of the invention can be tested in a standard DNA sequencing or DNA primer extension reaction. One such assay can be performed in a 100 μl (final volume) reaction mixture, containing, for example, 0.1 mM dCTP, dTTP, dGTP, α-³²P-dATP, 0.3 mg/ml activated calf thymus DNA and 0.5 mg/ml BSA in a buffer containing: 50 mM KCl, 1 mM DTT, 10 mM MgCl₂ and 50 mM of a buffering compound such as PIPES, Tris or Triethylamine. A dilution to 0.1 units/μl of each polymerase enzyme is prepared, and 5 μl of such a dilution is added to the reaction mixture, followed by incubation at 60° C. for 10 minutes. Reaction products can be detected by determining the amount of ³²P incorporated into DNA or by observing the products after separation on a polyacrylamide gel.
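The quantities implied by the assay conditions above can be checked with a short calculation. The following Python sketch is illustrative only; the helper names are hypothetical, and the 1 M KCl stock concentration is an assumption introduced solely to show the dilution arithmetic (the text above does not specify stock concentrations).

```python
def nmol_delivered(conc_mM, volume_ul):
    """Amount of solute in nmol: (mmol/L) x (µl) = nmol."""
    return conc_mM * volume_ul


def stock_volume_ul(final_conc, stock_conc, final_volume_ul):
    """Volume of stock needed, from C1 * V1 = C2 * V2 (same concentration units)."""
    return final_conc * final_volume_ul / stock_conc


REACTION_UL = 100.0  # final reaction volume stated in the text

# 0.1 mM of each dNTP in a 100 µl reaction.
dntp_nmol = nmol_delivered(0.1, REACTION_UL)

# 5 µl of a 0.1 units/µl enzyme dilution.
enzyme_units = 0.1 * 5.0

# Hypothetical 1 M (1000 mM) KCl stock diluted to 50 mM final.
kcl_stock_ul = stock_volume_ul(50.0, 1000.0, REACTION_UL)

if __name__ == "__main__":
    print(dntp_nmol, "nmol of each dNTP")
    print(enzyme_units, "units of polymerase")
    print(kcl_stock_ul, "µl of the assumed 1 M KCl stock")
```

On these figures each reaction contains about 10 nmol of each nucleotide and 0.5 units of polymerase.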

Uses for Nucleic Acid Polymerase Polypeptides

The thermostable enzymes of this invention may be used for any purpose in which DNA polymerase or reverse transcriptase activity is necessary or desired. For example, the present nucleic acid polymerase polypeptides can be used in one or more of the following procedures: DNA sequencing, DNA amplification, RNA amplification, reverse transcription, DNA synthesis and/or primer extension. The nucleic acid polymerase polypeptides of the invention can be used to amplify DNA by polymerase chain reaction (PCR). The nucleic acid polymerase polypeptides of the invention can be used to sequence DNA by Sanger sequencing procedures. The nucleic acid polymerase polypeptides of the invention can also be used in primer extension reactions. The nucleic acid polymerase polypeptides of the invention can also be used for reverse transcription. The nucleic acid polymerase polypeptides of the invention can be used to test for single nucleotide polymorphisms (SNPs) by single nucleotide primer extension using terminator nucleotides. Any such procedures and related procedures, for example, polynucleotide or primer labeling, minisequencing and the like are contemplated for use with the present nucleic acid polymerase polypeptides.

Methods of the invention comprise the step of extending a primed polynucleotide template with at least one labeled nucleotide, wherein the extension is catalyzed by a nucleic acid polymerase of the invention. Nucleic acid polymerases used for Sanger sequencing can produce fluorescently labeled products that are analyzed on an automated fluorescence-based sequencing apparatus such as an Applied Biosystems 310 or 377 (Applied Biosystems, Foster City, Calif.). Detailed protocols for Sanger sequencing are known to those skilled in the art and may be found, for example in Sambrook et al, Molecular Cloning, A Laboratory Manual, Second Edition, Cold Spring Harbor Press, Cold Spring Harbor, N.Y. (1989).

In one embodiment, the nucleic acid polymerase polypeptides of the invention are used for DNA amplification. Any procedure that employs a DNA polymerase can be used, for example, in polymerase chain reaction (PCR) assays, strand displacement amplification and other amplification procedures. Strand displacement amplification can be used as described in Walker et al (1992) Nucl. Acids Res. 20, 1691-1696. The term “polymerase chain reaction” (“PCR”) refers to the method of K. B. Mullis U.S. Pat. Nos. 4,683,195; 4,683,202; and 4,965,188, hereby incorporated by reference, which describe a method for increasing the concentration of a segment of a target sequence in a mixture of genomic DNA or other DNA or RNA without cloning or purification.

The PCR process for amplifying a target sequence consists of introducing a large excess of two oligonucleotide primers to the DNA mixture containing the desired target sequence, followed by a precise sequence of thermal cycling in the presence of a nucleic acid polymerase. The two primers are complementary to their respective strands of the double stranded target sequence. To effect amplification, the mixture is denatured and the primers annealed to their complementary sequences within the target molecule. Following annealing, the primers are extended with a polymerase so as to form a new pair of complementary strands. The steps of denaturation, primer annealing and polymerase extension can be repeated many times. Each round of denaturation, annealing and extension constitutes one “cycle.” There can be numerous cycles, and the amount of amplified DNA produced increases with the number of cycles. Hence, to obtain a high concentration of an amplified target nucleic acid, many cycles are performed.

The steps involved in the PCR nucleic acid amplification method are described in more detail below. For ease of discussion, the nucleic acid to be amplified is described as being double-stranded. However, the process is equally useful for amplifying a single-stranded nucleic acid, such as an mRNA, although the ultimate product is generally double-stranded DNA. In the amplification of a single-stranded nucleic acid, the first step involves the synthesis of a complementary strand (one of the two amplification primers can be used for this purpose), and the succeeding steps proceed as follows:

(a) Each nucleic acid strand is contacted with four different nucleoside triphosphates and one oligonucleotide primer for each nucleic acid strand to be amplified, wherein each primer is selected to be substantially complementary to a portion of the nucleic acid strand to be amplified, such that the extension product synthesized from one primer, when it is separated from its complement, can serve as a template for synthesis of the extension product of the other primer. To promote the proper annealing of primer(s) and the nucleic acid strands to be amplified, a temperature that allows hybridization of each primer to a complementary nucleic acid strand is used.

(b) After primer annealing, a nucleic acid polymerase is used for primer extension that incorporates the nucleoside triphosphates into a growing nucleic acid strand that is complementary to the strand hybridized by the primer. In general, this primer extension reaction is performed at a temperature and for a time effective to promote the activity of the enzyme and to synthesize a “full length” complementary nucleic acid strand that extends into and through a complete second primer binding site. However, the temperature is not so high as to separate each extension product from its nucleic acid template strand.

(c) The mixture from step (b) is then heated for a time and at a temperature sufficient to separate the primer extension products from their complementary templates. The temperature chosen is not so high as to irreversibly denature the nucleic acid polymerase present in the mixture.

(d) The mixture from step (c) is cooled for a time and at a temperature effective to promote hybridization of a primer to each of the single-stranded molecules produced in step (c).

(e) The mixture from step (d) is maintained at a temperature and for a time sufficient to promote primer extension by DNA polymerase to produce a “full length” extension product. The temperature used is not so high as to separate each extension product from the complementary strand template. Steps (c)-(e) are repeated until the desired level of amplification is obtained.
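The dependence of product yield on the number of repetitions of the denaturation, annealing and extension steps can be illustrated with an idealized model. The following Python sketch assumes that every cycle multiplies the number of target copies by (1 + efficiency), where an efficiency of 1.0 corresponds to perfect doubling. This model, the function name ideal_copies and the efficiency parameter are assumptions introduced here for illustration; real reactions plateau as primers and nucleotides are consumed.

```python
def ideal_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after a given number of cycles.

    efficiency is the assumed fraction of templates copied per cycle
    (0.0 to 1.0); 1.0 means the copy number doubles every cycle.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(n, "cycles:", ideal_copies(1, n), "copies at perfect doubling")
```

Under perfect doubling a single template yields 2³⁰ (about 10⁹) copies after 30 cycles; any lower assumed efficiency gives correspondingly fewer.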

The amplification method is useful not only for producing large amounts of a specific nucleic acid sequence of known sequence but also for producing nucleic acid sequences that are known to exist but are not completely specified. One need know only a sufficient number of bases at both ends of the sequence in sufficient detail so that two oligonucleotide primers can be prepared that will hybridize to different strands of the desired sequence at relative positions along the sequence such that an extension product synthesized from one primer, when separated from the template (complement), can serve as a template for extension of the other primer. The greater the knowledge about the bases at both ends of the sequence, the greater can be the specificity of the primers for the target nucleic acid sequence.

Thermally stable nucleic acid polymerases are therefore generally used for PCR because they can function at the high temperatures used for melting double stranded target DNA and annealing the primers during each cycle of the PCR reaction. High temperature results in thermodynamic conditions that favor primer hybridization with the target sequences and not hybridization with non-target sequences (H. A. Erlich (ed.), PCR Technology, Stockton Press [1989]).

The thermostable nucleic acid polymerases of the present invention satisfy the requirements for effective use in amplification reactions such as PCR. The present polymerases do not become irreversibly denatured (inactivated) when subjected to the required elevated temperatures for the time necessary to melt double-stranded nucleic acids during the amplification process. Irreversible denaturation for purposes herein refers to permanent and complete loss of enzymatic activity. The heating conditions necessary for nucleic acid denaturation will depend, e.g., on the buffer salt concentration and the composition and length of the nucleic acids being denatured, but typically denaturation can be done at temperatures ranging from about 90° C. to about 105° C. The time required for denaturation depends mainly on the temperature and the length of the duplex nucleic acid. Typically the time needed for denaturation ranges from a few seconds up to four minutes. Higher temperatures may be required as the salt concentration of the buffer, or the length and/or GC composition of the nucleic acid is increased. The nucleic acid polymerases of the invention do not become irreversibly denatured for relatively short exposures to temperatures of about 90° C. to 100° C.

The thermostable polymerases of the invention have an optimum temperature at which they function that is higher than about 45° C. Temperatures below 45° C. facilitate hybridization of primer to template, but depending on salt composition and concentration and primer composition and length, hybridization of primer to template can occur at higher temperatures (e.g., 45° C. to 70° C.), which may promote specificity of the primer hybridization reaction. The polymerases of the invention exhibit activity over a broad temperature range from about 37° C. to about 90° C.

The present polymerases have particular utility for PCR not only because of their thermal stability but also because of their ability to synthesize DNA using an RNA template and because of their fidelity in replicating the template nucleic acid. In most PCR reactions that start with an RNA template, reverse transcriptase must be added. However, use of reverse transcriptase has certain drawbacks. First, it is not stable at higher temperatures. Hence, once the initial complementary DNA (cDNA) has been made by reverse transcriptase and the thermal cycles of PCR are started, the original RNA template is not used as a template in the amplification reaction. Second, reverse transcriptase does not produce a cDNA copy with particularly good sequence fidelity. With PCR, it is possible to amplify a single copy of a specific target or template nucleic acid to a level detectable by several different methodologies. However, if the sequence of the target nucleic acid is not replicated with fidelity, then the amplified product can include a pool of nucleic acids with diverse sequences. Hence, the nucleic acid polymerases of the invention, which can accurately reverse transcribe RNA and replicate the sequence of the template RNA or DNA with high fidelity, are highly desirable.

Any nucleic acid can act as a “target nucleic acid” for the PCR methods of the invention. The term “target,” when used in reference to the polymerase chain reaction, refers to the region of nucleic acid bounded by the primers used for polymerase chain reaction. In addition to genomic DNA and mRNA, any cDNA, RNA, oligonucleotide or polynucleotide can be amplified with the appropriate set of primer molecules. In particular, the amplified segments created by the PCR process itself are, themselves, efficient templates for subsequent PCR amplifications. The length of the amplified segment of the desired target sequence is determined by the relative positions of the primers with respect to each other, and therefore, this length is readily controlled.

The amplified target nucleic acid can be detected by any method known to one of skill in the art. For example, target nucleic acids are often amplified to such an extent that they form a band visible on a size separation gel. Target nucleic acids can also be detected by hybridization with a labeled probe; by incorporation of biotinylated primers during PCR followed by avidin-enzyme conjugate detection; by incorporation of ³²P-labeled deoxynucleotide triphosphates during PCR, and the like.

The amount of amplification can also be monitored, for example, by use of a reporter-quencher oligonucleotide as described in U.S. Pat. No. 5,723,591, and a nucleic acid polymerase of the invention that has 5′-3′ nuclease activity. The reporter-quencher oligonucleotide has an attached reporter molecule and an attached quencher molecule that is capable of quenching the fluorescence of the reporter molecule when the two are in proximity. Quenching occurs when the reporter-quencher oligonucleotide is not hybridized to a complementary nucleic acid because the reporter molecule and the quencher molecule tend to be in proximity or at an optimal distance for quenching. When hybridized, the reporter-quencher oligonucleotide emits more fluorescence than when unhybridized because the reporter molecule and the quencher molecule tend to be further apart. To monitor amplification, the reporter-quencher oligonucleotide is designed to hybridize 3′ to an amplification primer. During amplification, the 5′-3′ nuclease activity of the polymerase digests the reporter oligonucleotide probe, thereby separating the reporter molecule from the quencher molecule. As the amplification is conducted, the fluorescence of the reporter molecule increases. Accordingly, the amount of amplification performed can be quantified based on the increase of fluorescence observed.

Oligonucleotides used for PCR primers are usually about 9 to about 75 nucleotides, preferably about 17 to about 50 nucleotides in length. Preferably, an oligonucleotide for use in PCR reactions is about 40 or fewer nucleotides in length (e.g., 9, 12, 15, 18, 20, 21, 24, 27, 30, 35, 40, or any number between 9 and 40). Generally specific primers are at least about 14 nucleotides in length. For optimum specificity and cost effectiveness, primers of 16-24 nucleotides in length are generally preferred.

Those skilled in the art can readily design primers for use in processes such as PCR. For example, potential primers for nucleic acid amplification can be used as probes to determine whether the primer is selective for a single target and what conditions permit hybridization of a primer to a target within a sample or complex mixture of nucleic acids.

The present invention also contemplates use of the present nucleic acid polymerases in combination with other procedures or enzymes. For example, the polymerases can be used in combination with additional reverse transcriptase or another DNA polymerase. See U.S. Pat. No. 5,322,770, incorporated by reference herein.

In another embodiment, nucleic acid polymerases of the invention with 5′-3′ exonuclease activity are used to detect target nucleic acids in an invader-directed cleavage assay. This type of assay is described, for example, in U.S. Pat. No. 5,994,069. It is important to note that the 5′-3′ exonuclease of DNA polymerases is not really an exonuclease that progressively cleaves nucleotides from the 5′ end of a nucleic acid, but rather a nuclease that can cleave certain types of nucleic acid structures to produce oligonucleotide cleavage products. Such cleavage is sometimes called structure-specific cleavage.

In general, the invader-directed cleavage assay employs at least one pair of oligonucleotides that interact with a target nucleic acid to form a cleavage structure for the 5′-3′ nuclease activity of the nucleic acid polymerase. Distinctive cleavage products are released when the cleavage structure is cleaved by the 5′-3′ nuclease activity of the polymerase. Formation of such a target-dependent cleavage structure and the resulting cleavage products is indicative of the presence of specific target nucleic acid sequences in the test sample.

Therefore, in the invader-directed cleavage procedure, the 5′-3′ nuclease activity of the present polymerases is needed, as well as at least one pair of oligonucleotides that interact with a target nucleic acid to form a cleavage structure for the 5′-3′ nuclease. The first oligonucleotide, sometimes termed the “probe,” can hybridize within the target site but downstream of a second oligonucleotide, sometimes termed an “invader” oligonucleotide. The invader oligonucleotide can hybridize adjacent and upstream of the probe oligonucleotide. However, the target sites to which the probe and invader oligonucleotides hybridize overlap such that the 3′ segment of the invader oligonucleotide overlaps with the 5′ segment of the probe oligonucleotide. The 5′-3′ nuclease of the present polymerases can cleave the probe oligonucleotide at an internal site to produce distinctive fragments that are diagnostic of the presence of the target nucleic acid in a sample. Further details and methods for adapting the invader-directed cleavage assay to particular situations can be found in U.S. Pat. No. 5,994,069.

One or more nucleotide analogs can also be used with the present methods, kits and with the nucleic acid polymerases. Such nucleotide analogs can be modified or non-naturally occurring nucleotides such as 7-deaza purines (i.e., 7-deaza-dATP and 7-deaza-dGTP). Nucleotide analogs include base analogs and comprise modified forms of deoxyribonucleotides as well as ribonucleotides. As used herein the term “nucleotide analog” when used in reference to targets present in a PCR mixture refers to the use of nucleotides other than dATP, dGTP, dCTP and dTTP; thus, the use of dUTP (a naturally occurring dNTP) in a PCR would comprise the use of a nucleotide analog in the PCR. A PCR product generated using dUTP, 7-deaza-dATP, 7-deaza-dGTP or any other nucleotide analog in the reaction mixture is said to contain nucleotide analogs.

The invention also provides kits that contain at least one of the nucleic acid polymerases of the invention. Individual kits may be adapted for performing one or more of the following procedures: DNA sequencing, DNA amplification, RNA amplification and/or primer extension. Kits of the invention comprise a DNA polymerase polypeptide of the invention and at least one nucleotide. A nucleotide provided in the kits of the invention can be labeled or unlabeled. Kits preferably can also contain instructions on how to perform the procedures for which the kits are adapted.

Optionally, the subject kit may further comprise at least one other reagent required for performing the method the kit is adapted to perform. Examples of such additional reagents include: another unlabeled nucleotide, another labeled nucleotide, a balance mixture of nucleotides, one or more chain terminating nucleotides, one or more nucleotide analogs, buffer solution(s), magnesium solution(s), cloning vectors, restriction endonucleases, sequencing primers, reverse transcriptase, and DNA or RNA amplification primers. The reagents included in the kits of the invention may be supplied in premeasured units so as to provide for greater precision and accuracy. Typically, kit reagents and other components are placed and contained in separate vessels. A reaction vessel, test tube, microwell tray, microtiter dish or other container can also be included in the kit. Different labels can be used on different reagents so that each reagent can be distinguished from another.

The following Examples further illustrate the invention and are not intended to limit the scope of the invention.

Example 1 Cloning of a Nucleic Acid Polymerase from the RQ-1 and GK24 Strains of Thermus thermophilus

Bacteria Growth and Genomic DNA Isolation

A bacterial sample of the Thermus thermophilus strain RQ-1 (Tth RQ-1) was obtained from the German Collection of Microorganisms (DSM catalog number 9247). The GK24 strain of Thermus thermophilus was obtained from Dr. R. A. D. Williams, Queen Mary and Westfield College, London, England. The lyophilized bacteria were revived in 4 ml of ATCC Thermus bacteria growth media 461 (Castenholz TYE medium). The 4 ml overnight culture was grown at 65° C. in a water bath orbital shaker. The 4 ml culture was transferred to 200 ml of TYE and grown overnight at 65° C. in a water bath orbital shaker to stationary phase. Genomic DNA was prepared from these bacterial strains using a Qiagen genomic DNA preparation kit (Qiagen Inc., Valencia, Calif.).

Cloning of a Nucleic Acid Polymerase Gene from the RQ-1 and GK24 Strains of Thermus thermophilus

The forward and reverse primers were designed by analysis of 5′ and 3′ terminal homologous conserved regions of the GenBank DNA sequences of the DNA Pol I genes from Thermus aquaticus (Taq), Thermus thermophilus (Tth), Thermus filiformis (Tfi), Thermus caldophilus, and Thermus flavus. A gene fragment of a nucleic acid polymerase from each of the RQ-1 and GK24 strains of Thermus thermophilus was amplified using N-terminal primer 5′-atg gag gcg atg ctt ccg ctc ttt gaa c-3′ (SEQ ID NO:25) and C-terminal primer 5′-gtc gac taa acg gca ggg ccc ccc taa cc-3′ (SEQ ID NO:26). The PCR reaction mixture contained 2.5 μl of 10× cPfu Turbo reaction buffer (Stratagene), 2 mM MgCl₂, 50 ng genomic DNA template, 0.2 mM (each) dNTPs, 20 pmol of each primer, and 10 units of Pfu Turbo DNA polymerase (Stratagene) in a 25 μl total reaction volume. The reaction was started by adding a premix containing enzyme, MgCl₂, dNTPs, buffer and water to another premix containing primer and template that had been preheated at 80° C. The entire reaction mixture was then denatured (30 s at 96° C.) followed by PCR cycling for 30 cycles (98° C. for 15 s, 56° C. for 30 s, and 72° C. for 3 min) with a finishing step (72° C. for 6 min). This produced a 2.5 kb DNA fragment. These amplified fragments were purified from the PCR reaction mix using a Qiagen PCR cleanup kit (Qiagen Inc., Valencia, Calif.). The fragments were then ligated into the inducible expression vector pCR®T7 CT-TOPO® (Invitrogen, Carlsbad, Calif.). Three different polymerase clones were sequenced independently in order to rule out PCR errors, yielding the full-length consensus sequences of the RQ-1 and GK24 strains of Thermus thermophilus. The nucleic acid sequence for the GK24 strain of Thermus thermophilus is provided as SEQ ID NO:1. The nucleic acid sequence for the RQ-1 strain of Thermus thermophilus is provided as SEQ ID NO:2. The amino acid sequence for the GK24 polymerase has SEQ ID NO:13. The amino acid sequence for the RQ-1 polymerase has SEQ ID NO:14.
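The cycling program and primers described above lend themselves to a quick arithmetic check. The following Python sketch is illustrative only: it encodes the stated temperatures and hold times, sums the programmed hold time (ramping between temperatures is ignored, which is an assumption), and reports the length and G+C count of each primer. The constant and function names are hypothetical.

```python
# (temperature in degrees C, hold time in seconds), as stated in the text.
INITIAL_DENATURATION = (96, 30)
CYCLE_STEPS = [(98, 15), (56, 30), (72, 180)]
N_CYCLES = 30
FINAL_EXTENSION = (72, 360)

PRIMERS = {
    "SEQ ID NO:25": "atg gag gcg atg ctt ccg ctc ttt gaa c",
    "SEQ ID NO:26": "gtc gac taa acg gca ggg ccc ccc taa cc",
}


def total_hold_seconds(initial, cycle_steps, n_cycles, final):
    """Sum of programmed hold times; ramp times are not modeled."""
    per_cycle = sum(seconds for _temp, seconds in cycle_steps)
    return initial[1] + n_cycles * per_cycle + final[1]


def primer_stats(primer):
    """Return (length, number of G or C bases), ignoring spaces."""
    bases = primer.replace(" ", "").lower()
    return len(bases), sum(bases.count(base) for base in "gc")


if __name__ == "__main__":
    seconds = total_hold_seconds(
        INITIAL_DENATURATION, CYCLE_STEPS, N_CYCLES, FINAL_EXTENSION
    )
    print("total hold time:", seconds, "s =", seconds / 60, "min")
    for name, primer in PRIMERS.items():
        print(name, primer_stats(primer))
```

The programmed hold times total 7140 seconds, or 119 minutes, before any allowance for temperature ramping.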

Amino Acid Sequence Comparisons with Related Thermus thermophilus Polymerases

Comparison of the GK24 amino acid sequence (SEQ ID NO:13) with a published GK24 DNA Pol I sequence from Kwon et al. (Mol Cells. 1997 Apr. 30; 7 (2):264-71) revealed that SEQ ID NO:13 has four changes versus the Kwon sequence: Asn129→Lys, Pro130→Ala, Asp147→Tyr, and Gly797→Arg. These four positions are identified in bold in FIGS. 1A and 1B. In each of these four positions the Kwon GK24 DNA Pol I and a polypeptide with SEQ ID NO:13 have amino acids with dramatically different chemical properties. Asparagine (Kwon) has a polar, uncharged sidechain whereas lysine (SEQ ID NO:13) is charged and basic (Asn129→Lys, or N129K). Proline (Kwon) and alanine (SEQ ID NO:13) are both aliphatic but alanine promotes helix or beta sheet formation whereas prolines generally interrupt helices and sheets (Pro130→Ala, or P130A). Aspartate (Kwon) is an acidic amino acid and tyrosine (SEQ ID NO:13) is aromatic (Asp147→Tyr, or D147Y). Glycine (Kwon) has the smallest amino acid sidechain whereas arginine (SEQ ID NO:13) has the longest, most basic charged sidechain (Gly797→Arg, or G797R).
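The two notations used above for each change (for example Asn129→Lys and N129K) are related by the standard three-letter and one-letter amino acid codes. The following Python sketch illustrates the conversion. The function and variable names are hypothetical and are introduced here only for illustration.

```python
import re

# Standard three-letter to one-letter codes for the twenty common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_short_form(change: str) -> str:
    """Convert a change written like 'Asn129→Lys' into the short form 'N129K'."""
    match = re.fullmatch(r"([A-Za-z]{3})(\d+)→([A-Za-z]{3})", change.strip())
    if match is None:
        raise ValueError(f"unrecognized change: {change!r}")
    old, position, new = match.groups()
    return THREE_TO_ONE[old.capitalize()] + position + THREE_TO_ONE[new.capitalize()]


# The four differences between the Kwon sequence and SEQ ID NO:13 listed above.
KWON_TO_SEQ13 = ["Asn129→Lys", "Pro130→Ala", "Asp147→Tyr", "Gly797→Arg"]
SHORT_FORMS = [to_short_form(change) for change in KWON_TO_SEQ13]
print(SHORT_FORMS)
```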

SEQ ID NO:13 has three amino acid changes from the published sequence of Thermus thermophilus strain HB8 and twenty-two amino acid changes from the published sequence of Thermus thermophilus strain ZO5 (U.S. Pat. No. 5,674,738). These changes can be found in the amino acid alignment shown in FIGS. 1A and 1B.

Modification of Wild-Type Thermus thermophilus, Strain RQ-1 and GK24 Polymerases

In order to produce a polymerase in a form suitable for dye-terminator DNA sequencing, two substitutions were made to SEQ ID NO:1 and SEQ ID NO:2 to generate polypeptides with site-specific mutations. The mutations generated are the FS (Tabor and Richardson, 1995 PNAS 92:6339-6343; U.S. Pat. No. 5,614,365) and exo-minus mutations (see U.S. Pat. No. 5,466,591; Xu Y., Derbyshire V., Ng K., Sun X-C., Grindley N. D., Joyce C. M. (1997) J. Mol. Biol. 268, 284-302). To reduce the exonuclease activity to very low levels, the mutation Gly46→Asp, or G46D, was introduced. To reduce the discrimination between ddNTPs and dNTPs, the mutation Phe669→Tyr, or F669Y, was introduced. The G46D and F669Y mutations are widely used with the Taq Pol I for DNA sequencing.
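The effect of the G46D and F669Y substitutions on a polypeptide sequence can be sketched in Python as follows. This is an illustration only: the sequence used is an artificial placeholder, not SEQ ID NO:13 or SEQ ID NO:14, and the function name is hypothetical. The check on the wild-type residue guards against applying a substitution at the wrong position.

```python
def apply_substitution(protein: str, mutation: str) -> str:
    """Apply a substitution such as 'G46D' (1-based position) to a protein string."""
    wild_type, replacement = mutation[0], mutation[-1]
    index = int(mutation[1:-1]) - 1
    if index < 0 or index >= len(protein):
        raise ValueError(f"position out of range for {mutation}")
    if protein[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {index + 1}, found {protein[index]}"
        )
    return protein[:index] + replacement + protein[index + 1:]


# Artificial placeholder (NOT a sequence from this document): alanines with
# a glycine at position 46 and a phenylalanine at position 669.
placeholder = "A" * 45 + "G" + "A" * 622 + "F" + "A" * 10

mutant = apply_substitution(apply_substitution(placeholder, "G46D"), "F669Y")
print(mutant[45], mutant[668])
```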

Mutagenesis of SEQ ID NO:1 and SEQ ID NO:2 was carried out using the modified QuikChange™ (Stratagene) PCR mutagenesis protocol described in Sawano & Miyawaki (2000). The mutagenized nucleic acids were resequenced completely to confirm the introduction of the mutations and to ensure that no PCR errors were introduced. A nucleic acid encoding the FS, exo-minus version of the GK24 polymerase of the invention is provided as SEQ ID NO:10, with amino acid sequence SEQ ID NO:22. A nucleic acid encoding the FS, exo-minus version of the RQ-1 polymerase of the invention is provided as SEQ ID NO:11, with amino acid sequence SEQ ID NO:23.

Protein Expression and Purification

Nucleic acids having SEQ ID NO:10 and SEQ ID NO:11 were separately inserted into the cloning vector pCR®T7 CT-TOPO® (Invitrogen, Carlsbad, Calif.) and these vectors were used to express the protein. BL21 E. coli cells (Invitrogen) were transformed with the vector containing SEQ ID NO:7 or SEQ ID NO:8. The cells were grown in one liter of Terrific Broth (Maniatis) to an optical density of 1.2 OD and the protein was overproduced by four-hour induction with 1.0 mM IPTG. The cells were harvested by centrifugation, washed in 50 mM Tris (pH 7.5), 5 mM EDTA, 5% glycerol, 10 mM EDTA to remove growth media, and the cell pellet frozen at −80° C.

To isolate the GK24 and RQ-1 polymerases, the cells were thawed and resuspended in 2.5 volumes (wet weight) of 50 mM Tris (pH 7.2), 400 mM NaCl, and 1 mM EDTA. The cell walls were disrupted by sonication. The resulting E. coli cell debris was removed by centrifugation. The cleared lysate was pasteurized in a water bath (75° C., 45 min), denaturing and precipitating the majority of the non-thermostable E. coli proteins and leaving the thermostable GK24 (SEQ ID NO:22) and RQ-1 (SEQ ID NO:23) polymerases in solution. E. coli genomic DNA was removed by coprecipitation with 0.3% polyethyleneimine (PEI). The cleared lysate was then applied to two columns in series: (1) a Biorex 70 cation exchange resin which chelates excess PEI and (2) a heparin-agarose column (dimensions to be provided) which retains the polymerase. This column was washed with 5 column volumes of 20 mM Tris (pH 8.5), 5% glycerol, 100 mM NaCl, 0.1 mM EDTA, 0.05% Triton X-100 and 0.05% Tween-20 (KTA buffer). The proteins were then eluted with a 0.1 to 1.0 M NaCl linear gradient. The polymerases eluted at 0.8 M NaCl. The eluted polymerases were concentrated and the buffer exchanged using a Millipore concentration filter (30 kD Mwt cutoff). The concentrated protein was stored in KTA buffer (no salt) plus 50% glycerol at −20° C. The activity of the polymerase was measured using a salmon sperm DNA radiometric activity assay.
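For a linear gradient, the salt concentration at any point is a simple interpolation between the starting and ending concentrations. The following Python sketch illustrates where along a 0.1 to 1.0 M NaCl gradient a concentration of 0.8 M is reached. The gradient volume is not stated above, so the result is expressed only as a fraction of the gradient, and the function names are hypothetical.

```python
def gradient_concentration(fraction: float, start_M: float = 0.1, end_M: float = 1.0) -> float:
    """Return the salt concentration at a given fraction (0 to 1) of a linear gradient."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return start_M + fraction * (end_M - start_M)


def gradient_fraction(target_M: float, start_M: float = 0.1, end_M: float = 1.0) -> float:
    """Return the fraction of a linear gradient at which target_M is reached."""
    if not start_M <= target_M <= end_M:
        raise ValueError("target concentration lies outside the gradient")
    return (target_M - start_M) / (end_M - start_M)


# Fraction of the 0.1 to 1.0 M gradient delivered when 0.8 M NaCl is reached.
elution_fraction = gradient_fraction(0.8)
print(f"0.8 M is reached {elution_fraction:.0%} of the way through the gradient")
```

Under these assumptions, 0.8 M NaCl is reached roughly 78% of the way through the gradient, whatever the gradient volume.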

Example 2

Cloning of a Nucleic Acid Polymerase from the 1b21 Strain of Thermus thermophilus

Bacteria Growth and Genomic DNA Isolation

The 1b21 strain of Thermus thermophilus used in this invention was obtained from Dr. R. A. D. Williams, Queen Mary and Westfield College, London, England. The lyophilized bacteria were revived in 4 ml of ATCC Thermus bacteria growth media 461 (Castenholz TYE medium). The 4 ml overnight culture was grown at 65° C. in a water bath orbital shaker. The 4 ml culture was transferred to 200 ml of TYE and grown overnight at 65° C. in a water bath orbital shaker to stationary phase. Thermus thermophilus 1b21 genomic DNA was prepared using a Qiagen genomic DNA preparation kit (Qiagen Inc., Valencia, Calif.).

Cloning of Nucleic Acids Encoding Thermus thermophilus 1b21 Polymerase

The forward and reverse primers were designed by analysis of 5′ and 3′ terminal homologous conserved regions of the GenBank DNA sequences of the DNA Pol I genes from Thermus aquaticus (Taq), Thermus thermophilus (Tth), Thermus filiformis (Tfi), Thermus caldophilus, and Thermus flavus. A Thermus thermophilus 1b21 genomic DNA fragment encoding part of the polymerase coding region was amplified using N-terminal primer 5′-atg gag gcg atg ctt ccg ctc ttt gaa c-3′ (SEQ ID NO:27) and C-terminal primer 5′-gtc gac taa acg gca ggg ccc ccc taa cc-3′ (SEQ ID NO:28). The PCR reaction mixture contained 2.5 μl of 10× cPfu Turbo reaction buffer (Stratagene), 2 mM MgCl₂, 50 ng genomic DNA template, 0.2 mM (each) dNTPs, 20 pmol of each primer, and 10 units of Pfu Turbo DNA polymerase (Stratagene) in a 25 μl total reaction volume. The reaction was started by adding a premix containing enzyme, MgCl₂, dNTPs, buffer and water to another premix containing primer and template that had been preheated at 80° C. The entire reaction mixture was then denatured (30 s at 96° C.) followed by PCR cycling for 30 cycles (98° C. for 15 s, 56° C. for 30 s, and 72° C. for 3 min) with a finishing step (72° C. for 6 min). This produced a 2.5 kb DNA fragment. This amplified fragment was purified from the PCR reaction mix using a Qiagen PCR cleanup kit (Qiagen Inc., Valencia, Calif.). The fragment was then ligated into the inducible expression vector pCR®T7 CT-TOPO® (Invitrogen, Carlsbad, Calif.). Three different Thermus thermophilus 1b21 genomic DNA fragments encoding the full-length gene were sequenced independently in order to rule out PCR errors. The resulting consensus sequence is SEQ ID NO:3, the nucleotide sequence for the polymerase isolated from Thermus thermophilus, strain 1b21. The amino acid sequence for the polymerase isolated from Thermus thermophilus, strain 1b21 is SEQ ID NO:15.
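Basic properties of the two primers given above, such as length and G+C content, can be tallied directly from the printed sequences. The following Python sketch is an illustration only. The helper names are hypothetical, and no melting temperature is estimated because that would require assumptions not stated here.

```python
def clean(primer: str) -> str:
    """Remove the spacing used for readability and normalize to lower case."""
    return "".join(primer.split()).lower()


def gc_count(primer: str) -> int:
    """Count guanine and cytosine bases in a primer."""
    sequence = clean(primer)
    return sequence.count("g") + sequence.count("c")


# Primer sequences as printed above (5' to 3').
FORWARD = "atg gag gcg atg ctt ccg ctc ttt gaa c"
REVERSE = "gtc gac taa acg gca ggg ccc ccc taa cc"

for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
    length = len(clean(primer))
    gc = gc_count(primer)
    print(f"{name}: {length} nt, {gc} G/C ({gc / length:.0%})")
```

By this count the forward primer is 28 nucleotides with 15 G or C bases (about 54%), and the reverse primer is 29 nucleotides with 19 G or C bases (about 66%).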

Thermus thermophilus 1b21 Polymerase Expression and Purification

A nucleic acid having SEQ ID NO:12 (containing FS and exo-minus mutations) was inserted into cloning vector pCR®T7 CT-TOPO® (Invitrogen, Carlsbad, Calif.) to express the protein. BL21 E. coli cells (Invitrogen) were transformed with this vector containing SEQ ID NO:12. The cells were grown in one liter of Terrific Broth (Maniatis) to an optical density of 1.2 OD and the protein was overproduced by four-hour induction with 1.0 mM IPTG. The cells were harvested by centrifugation, washed in 50 mM Tris (pH 7.5), 5 mM EDTA, 5% glycerol, 10 mM EDTA to remove growth media, and the cell pellet frozen at −80° C.

To isolate the Thermus thermophilus, strain 1b21 polymerase, the cells were thawed and resuspended in 2.5 volumes (wet weight) of 50 mM Tris (pH 7.2), 400 mM NaCl, and 1 mM EDTA. The cell walls were disrupted by sonication. The resulting E. coli cell debris was removed by centrifugation. The cleared lysate was pasteurized in a water bath (75° C., 45 min), denaturing and precipitating the majority of the non-thermostable E. coli proteins and leaving the thermostable Thermus thermophilus, strain 1b21 polymerase in solution. E. coli genomic DNA was removed by coprecipitation with 0.3% polyethyleneimine (PEI). The cleared lysate was then applied to two columns in series: (1) a Biorex 70 cation exchange resin which chelates excess PEI and (2) a heparin-agarose column (dimensions to be provided) which retains the polymerase. This column was washed with 5 column volumes of 20 mM Tris (pH 8.5), 5% glycerol, 100 mM NaCl, 0.1 mM EDTA, 0.05% Triton X-100 and 0.05% Tween-20 (KTA buffer). The protein was then eluted with a 0.1 to 1.0 M NaCl linear gradient. The polymerase eluted at 0.8 M NaCl. The eluted Thermus thermophilus, strain 1b21 polymerase was concentrated and the buffer exchanged using a Millipore concentration filter (30 kD Mwt cutoff). The concentrated protein was stored in KTA buffer (no salt) plus 50% glycerol at −20° C. The activity of the polymerase was measured using a salmon sperm DNA radiometric activity assay.

REFERENCES

-   Tabor S. & Richardson C. C. A single residue in DNA polymerases of the Escherichia coli DNA polymerase I family is critical for distinguishing between deoxy- and dideoxyribonucleotides. Proc Natl Acad Sci USA. 1995. Vol. 92 (14): 6339-43.
-   Sawano A. & Miyawaki A. Directed evolution of green fluorescent protein by a new versatile PCR strategy for site-directed and semi-random mutagenesis. Nucleic Acids Res. 2000. Vol. 28 (16): E78.
-   Kwon S. T., Kim J. S., Park J. H., Kim H. K., Lee D. S. Cloning and analysis of the DNA polymerase-encoding gene from Thermus caldophilus GK24. Mol. Cells. 1997. Vol. 7 (2): 264-71.
-   U.S. Pat. No. 5,455,170 to Abramson et al.

What is claimed:
 1. An isolated nucleic acid encoding a Thermus thermophilus strain RQ-1 (DSM catalog number 9247) nucleic acid polymerase.
 2. The isolated nucleic acid of claim 1 wherein the nucleic acid polymerase is a DNA polymerase.
 3. An isolated nucleic acid encoding a nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-24.
 4. An isolated nucleic acid encoding a derivative nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15 having a mutation that decreases 5′-3′ exonuclease activity.
 5. The isolated nucleic acid of claim 4, wherein the derivative nucleic acid polymerase has decreased 5′-3′ exonuclease activity relative to a nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15.
 6. An isolated nucleic acid encoding a derivative nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15 having a mutation that reduces discrimination against dideoxynucleotide triphosphates.
 7. The isolated nucleic acid of claim 6, wherein the derivative nucleic acid polymerase has reduced discrimination against dideoxynucleotide triphosphates relative to a nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15.
 8. An isolated nucleic acid encoding a nucleic polymerase comprising any one of SEQ ID NO:1-12.
 9. An isolated nucleic acid comprising a nucleotide sequence complementary to any one of SEQ ID NO:1-12.
 10. A vector comprising an isolated nucleic acid encoding a Thermus thermophilus strain RQ-1 (DSM catalog number 9247) nucleic acid polymerase.
 11. The vector of claim 10 wherein the nucleic acid polymerase is a DNA polymerase.
 12. A vector comprising an isolated nucleic acid encoding a nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-24.
 13. A vector comprising an isolated nucleic acid encoding a derivative nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15 having a mutation that decreases 5′-3′ exonuclease activity.
 14. The vector of claim 13, wherein the derivative nucleic acid polymerase has decreased 5′-3′ exonuclease activity relative to a nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15.
 15. A vector comprising an isolated nucleic acid encoding a derivative nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15 having a mutation that reduces discrimination against dideoxynucleotide triphosphates.
 16. The vector of claim 15, wherein the derivative nucleic acid polymerase has reduced discrimination against dideoxynucleotide triphosphates relative to a nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15.
 17. A vector comprising an isolated nucleic acid encoding a nucleic polymerase comprising any one of SEQ ID NO:1-12.
 18. An expression vector comprising a promoter operably linked to an isolated nucleic acid encoding a Thermus thermophilus strain RQ-1 (DSM catalog number 9247) nucleic acid polymerase.
 19. The expression vector of claim 18 wherein the nucleic acid polymerase is a DNA polymerase.
 20. An expression vector comprising a promoter operably linked to an isolated nucleic acid encoding a nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-24.
 21. An expression vector comprising a promoter operably linked to an isolated nucleic acid encoding a derivative nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15 having a mutation that decreases 5′-3′ exonuclease activity.
 22. The expression vector of claim 21, wherein the derivative nucleic acid polymerase has decreased 5′-3′ exonuclease activity relative to a nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15.
 23. An expression vector comprising a promoter operably linked to an isolated nucleic acid encoding a derivative nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15 having a mutation that reduces discrimination against dideoxynucleotide triphosphates.
 24. The expression vector of claim 23, wherein the derivative nucleic acid polymerase has reduced discrimination against dideoxynucleotide triphosphates relative to a nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15.
 25. An expression vector comprising a promoter operably linked to an isolated nucleic acid encoding a nucleic polymerase comprising any one of SEQ ID NO:1-12.
 26. A host cell comprising an isolated nucleic acid encoding a Thermus thermophilus strain RQ-1 (DSM catalog number 9247) nucleic acid polymerase.
 27. The host cell of claim 26 wherein the nucleic acid polymerase is a DNA polymerase.
 28. A host cell comprising an isolated nucleic acid encoding a nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-24.
 29. A host cell comprising an isolated nucleic acid encoding a derivative nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15 having a mutation that decreases 5′-3′ exonuclease activity.
 30. The host cell of claim 29, wherein the derivative nucleic acid polymerase has decreased 5′-3′ exonuclease activity relative to a nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15.
 31. A host cell comprising an isolated nucleic acid encoding a derivative nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15 having a mutation that reduces discrimination against dideoxynucleotide triphosphates.
 32. The host cell of claim 31, wherein the derivative nucleic acid polymerase has reduced discrimination against dideoxynucleotide triphosphates relative to a nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15.
 33. A host cell comprising an isolated nucleic acid encoding a nucleic polymerase comprising any one of SEQ ID NO:1-12.
 34. An isolated nucleic acid polymerase from Thermus thermophilus strain RQ-1 (DSM catalog number 9247).
 35. The isolated nucleic acid polymerase of claim 34 wherein the nucleic acid polymerase is a DNA polymerase.
 36. An isolated nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-24.
 37. An isolated nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15 having a mutation that decreases 5′-3′ exonuclease activity.
 38. The isolated nucleic acid polymerase of claim 37, wherein the derivative nucleic acid polymerase has decreased 5′-3′ exonuclease activity relative to a nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15.
 39. An isolated nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15 having a mutation that reduces discrimination against dideoxynucleotide triphosphates.
 40. The isolated nucleic acid polymerase of claim 31, wherein the derivative nucleic acid polymerase has reduced discrimination against dideoxynucleotide triphosphates relative to a nucleic acid polymerase comprising any one of amino acid sequences SEQ ID NO:13-15.
 41. A kit comprising a container containing a nucleic acid polymerase, wherein the nucleic acid polymerase comprises any one of amino acid sequences SEQ ID NO:13-24.
 42. The kit of claim 41 further comprising a container containing an unlabeled nucleotide, a labeled nucleotide, a balanced mixture of nucleotides, a chain terminating nucleotide, a nucleotide analog, a buffer solution, a solution containing magnesium, a cloning vector, a restriction endonuclease, a sequencing primer, a solution containing reverse transcriptase, or a DNA or RNA amplification primer.
 43. The kit of claim 41, adapted for performing DNA sequencing, DNA amplification, reverse transcription, RNA amplification or primer extension.
 44. A method of making a nucleic acid polymerase comprising any one of SEQ ID NO:13-24 that comprises, incubating under conditions sufficient for RNA transcription and translation a host cell comprising a nucleic acid encoding a polypeptide comprising any one of SEQ ID NO:13-24 operably linked to a promoter.
 45. The method of claim 44 wherein the nucleic acid comprises any one of SEQ ID NO:1-12.
 46. A nucleic acid polymerase made by the method of claim 44.
 47. A method of synthesizing DNA comprising contacting a polypeptide comprising any one of SEQ ID NO:13-24 with a DNA under conditions sufficient to permit polymerization of DNA.
 48. A method for thermocyclic amplification of nucleic acid comprising: (a) contacting a nucleic acid with a thermostable polypeptide having any one of SEQ ID NO:13-24 under conditions suitable for amplification of said nucleic acid; and (b) amplifying the nucleic acid.
 49. The method of claim 48 wherein the thermocyclic amplification of the nucleic acid includes cycles of denaturation, primer annealing and primer extension.
 50. The method of claim 48 wherein the thermocyclic amplification of the nucleic acid is performed by Strand Displacement Amplification.
 51. The method of claim 48 wherein thermocyclic amplification of the nucleic acid is performed by Polymerase Chain Reaction.
 52. A method of primer extension comprising contacting a polypeptide comprising any one of SEQ ID NO:13-24 with a primer and a nucleic acid that is complementary to the primer under conditions sufficient to permit polymerization of DNA.
 53. The method of claim 52 wherein the nucleic acid is DNA.
 54. The method of claim 52 wherein the primer extension is done to sequence the nucleic acid.
 55. The method of claim 52 wherein the primer extension is done to amplify the nucleic acid.